Turbines and turbulence


Some legitimate questions have been raised over the green credentials of wind turbines. Politics
must not block research where it is needed.


Will wind turbines wreck the environment? Last month, the
South China Morning Post published a news story that
contained a thinly veiled attack on China's wind industry.
The article cited herdsmen in a village in Inner Mongolia who say rain 
stopped falling after the establishment of a nearby wind farm, and 
meteorologists who backed up the observation with a few years’ data 
that show low precipitation. The article also quoted an engineer in the 
government's renewable-energy department who hastily dismissed 
concern over the effect of wind farms, refused to acknowledge the 
need for research, and asserted the overarching necessity for China 
to develop wind energy. The article concludes that “wind power is not 
completely green”. There have been similar attacks on wind energy in 
Texas and elsewhere. 

It is good to see that the newspaper, Hong Kong's most prominent 
English-language daily, retains a critical stance towards the Chinese 
government under the ‘one country, two systems’ policy, and is willing 
to put Chinese officials on the spot. But in this case, the dismissive offi- 
cial quoted probably has a point. There is no solid scientific evidence 
that wind turbines can trigger major changes in rainfall. And given 
Nature’s conversations with atmospheric modellers outside China, 
people are not likely to find any. One expert said the idea that a wind 
farm could have such a dramatic and demonstrable effect was “silly”. 

Wind farms, however, may affect regional or global environmental 
systems — although to suggest this can draw rapid scorn from wind- 
power proponents. In 2004, the environmental engineer and atmos- 
pheric modeller Somnath Baidya Roy, then at Princeton University in 
New Jersey, published work showing turbulence created by turbines 
would, among other effects, lead to vertical mixing of energy and heat 
in atmospheric layers that would affect local temperatures, and pos- 
sibly change evaporation patterns (S. B. Roy et al. J. Geophys. Res. 109, 
D19101; 2004). Some took his study as an attack on the wind industry, 
and he was besieged with nasty e-mails. They questioned his sanity, 
threatened to get him fired from his post at Princeton, and accused 
him of being a pawn of the coal or oil industries. (He has never had nor 
sought any industrial ties.) The president of one US-based wind-farm 
firm told Roy to consider “how much heat is your head turning out, 
while you consider such thoughts?” and to ponder many other factors 
“while checking your navel for lint”. (We know this because Roy con- 
sidered the comments humorous enough to post on his webpage.) 

At around the same time, other scientists used models to suggest 
that wind turbines could have effects on climate change and suggested 
that estimates of these effects should be balanced against their green 
benefits. Although these researchers are seen by some in the industry 
as overly critical, they concluded with no stronger recommendation 
than a call for more research. 

In October, Roy, now at the University of Illinois at Urbana- 
Champaign, published data to back up his theoretical work (S. B. Roy 
and J. J. Traiteur Proc. Natl Acad. Sci. USA 107, 17899-17904; 2010). 


A 25-year data set showed a significant effect of wind farms on near- 
surface temperatures. Roy suggested in the paper that those construct- 
ing wind farms should consider low-turbulence turbines or use the 
results to help find the most suitable sites. It hardly constituted an 
attack on wind energy. In fact, he says, the main impact, a raising of
surface temperatures at night and lowering during the day, could benefit
agriculture by decreasing frost damage and extending the growing season.
Many farmers already do this with air circulators.


Roy’s study was on wind farms with some 
20 turbines. Local effects will be more marked 
in much larger farms. Roy hopes to start a 
field campaign that can monitor energy fluxes, evaporation, humidity 
and temperature on a variety of farms as they scale up. 

China, developing huge wind farms and planning more, should take 
a prominent role in such studies. As its facilities expand, it can make 
solid scientific assessments, which could contribute to a more rational 
and beneficial use of wind. Although the Chinese official may have been 
right to dismiss the suggested effect on rainfall, his government 
should not ignore the need for wider research on the impact of its 
wind revolution.




Assessment time 


Italy’s proposed university reform must be 
linked to performance. 


As Rome burned last week during anti-government riots, many
of those present were focusing on the plight of Italy’s underfunded
and underperforming universities, which face major
reform. There is no doubt that reform is needed. The question is
whether the government will deliver it correctly.

Islands of excellence exist in Italian universities, particularly in
the north of the country. And they survive despite such low levels of
government investment that little cash remains for infrastructure or
research once salaries have been paid. But malaise is widespread, and
money is not the only question. University workforces are riddled
with dead wood, a legacy of too little competition for academic posts
or research grants. And universities are not penalized if they choose to
hire staff on the basis of personal contacts instead of talent.

A controversial new law, expected to be approved this week, attempts
to fix these issues. It is imperfect, but if implemented properly, it will
give Italy’s universities a brighter future. Critical to its implementation,
though, is the prompt creation of a long-promised evaluation agency
to assess teaching and research performance and link them to univer- 
sity budgets. Also critical is money — just as throwing money at the 
problem won't solve the malaise on its own, reforms without additional 
funds won't be effective.

A law to reform universities was drafted in 2007 by the previous, 
centre-left government, which also proposed setting up an evaluation 
agency, known as ANVUR (National Agency for the Evaluation of the 
University and Research System), modelled on France's AERES agency. 
The current, centre-right government picked up and tweaked that draft. 
In doing so, it inserted the authority of its powerful finance ministry, 
which will directly manage some funds, and sign off annual budgets 
and budget proposals for each university. But the law also introduces 
some radical changes that could improve things. For example, it brings 
in mandatory peer review of all public research money, requiring that 
30% of individuals who sit on peer-review committees are working 
abroad. This will help to avert high-profile debacles like the minis- 
try of health’s behind-closed-doors allocation in 2007 of a €3-million 
(US$4-million) grant from its stem-cell research fund to scientists at 
a private foundation who claimed to be working more ethically than 
others — and the reversal of that decision following public outcry. 

Changes in the system for recruiting staff may also help, but not 
necessarily. Traditionally, academic staff have been selected by 
national committees and then allocated to universities to fill relevant 
vacancies. Incomprehensible to many of those in other countries, 
where universities choose their own staff, the ‘concorsi’ system was 
intended to challenge a tendency to recruit locally, without necessarily
choosing the best. But behind-the-scenes dealing among concorsi
committees ensured that universities mostly got the candidates they 
wanted anyway, for good or bad. Extensive tinkering in the past decade 
or so has not yet found a better balance between quality control at a 
national level and local university autonomy. In the new system, all 
candidates who pass a national qualification exam, judged by committees
similar to concorsi committees, will join a national list from which a
university may at any time select a candidate. The danger here is that less
academically suitable people may get on the list, because, as there is no
link to a concrete academic position, committees don't bear responsibility
for their choices.

Italians are familiar with fine-sounding 
reforms, such as the attempts to improve the concorsi system, that fail 
to actually change things. They enjoy quoting Giuseppe Tomasi di
Lampedusa’s The Leopard, a novel set around the time of Italy’s unification
in 1861, in which a protagonist observes contemporary politics,
and wryly notes how the newly empowered try to ‘change everything,
so that everything remains the same’. But this law has a strong chance
of changing things so that they do become different — and better. 
A crucial foundation for such success is that the government makes 
ANVUR happen soon. It was, after all, founded in law in February 
this year. Now, Italian scientists must see it built in bricks and mortar. 
The system needs more money, but that money must be linked to 
performance. Establishing ANVUR would show that Italy has placed 
its university system on the road to true reform.




Calm in a crisis 


Jane Lubchenco, Nature’s Newsmaker of the 
Year, shows how scientists can help society. 


For almost three months this year, a mini-volcano of oil and gas
erupted into the Gulf of Mexico and disgorged nearly 5 million
barrels of petroleum. Throughout the crisis, a poised scientist
gave countless media interviews to explain to a scared and angry public
how the US government was striving to contain the damage. Behind
the scenes, with decisive leadership, she ran the National Oceanic and
Atmospheric Administration (NOAA), the agency that closed fisheries,
tracked oil, protected habitats and assessed the damage to communities
and the environment. For her role in the response to the crisis,
Jane Lubchenco is Nature’s Newsmaker of the Year (see page 1024).

Before becoming NOAA administrator in 2009, Lubchenco had a
reputation as both a leading researcher and an environmental advocate.
She made important advances in the basic science of coastal
ecology and helped to raise awareness of the many threats to ocean
ecosystems around the world. Lubchenco is now reorienting her
US$4.7-billion federal agency to strengthen the science and policies
that protect US marine resources.

The United States could do with more scientists like Lubchenco, with
the skills and the dedication to speak out on issues that matter. The need
will be particularly acute next year, when the Republican Party takes over
the US House of Representatives. Although Republicans have generally
supported basic science, incoming House leaders have made it clear that
they are hostile to certain areas of research. Some have pledged to hold
hearings on climate science, which they argue is seriously flawed and
has overstated the evidence for global warming. Adrian Smith (Republican,
Nebraska) introduced the YouCut Citizen Review, which calls on
the US public to search the National Science Foundation website list of
peer-reviewed grants for those they consider wasteful. And Darrell Issa
(Republican, California), the incoming head of the powerful Committee
on Oversight and Government Reform, last year led an effort to revoke
funding from the National Institutes of Health for studies of substance 
abuse and HIV risk in other countries (see Nature 460, 667; 2009). 

Scientific leaders in the United States must stand up against such 
attacks. As a first step, they should try to meet with incoming House 
members from both parties to voice their concerns and explain the 
rationale behind research in controversial areas. Recognizing that all 
politics is local, scientists will need to make clear why climate change 
or HIV research matters for the communities represented by members 
of Congress. They should take along science-savvy business leaders 
and locally elected officials to help make their case. 

Beyond the scientific leadership, there is a broader need for more 
individual scientists to communicate with the public. Currently, that 
kind of activity is not particularly valued — and is even disdained — in 
some fields of research. And spending time meeting with elected lead- 
ers or local journalists does not help a young scientist to get tenure. 

Most scientists receive no training in public communication, and will 
need to hone their skills. Some can learn from experienced mentors; 
others can benefit from programmes developed by scientific societies 
and other groups (see page 1032). Members of academic and govern- 
ment agencies can consult with public-affairs representatives, who can 
show them the best ways to communicate the results and implications 
of research. Another avenue is the Congressional Science Fellowship 
programme, through which scientific societies can sponsor scientists 
to work in congressional offices for a year, providing advice to elected 
officials. The societies involved should expand their programmes, and 
groups that do not currently sponsor fellows should consider it. 

As with any endeavour, it takes time to develop the communication 
skills that Lubchenco and other senior scientists have acquired. Even 
Lubchenco foundered at times during the oil spill. She made some 
mistakes and was criticized for the way that her agency initially down- 
played the evidence for oil spreading below the surface. Despite such 
slips, Lubchenco has steered her agency through 
the crisis with a steady hand. She is an outstand- 
ing example of how much one scientist can do 
to improve both society and natural ecosystems. 
Others would do well to follow her lead.


Save university arts from
the bean counters


Scientists must reach across the divide and speak up for campus colleagues in
arts and humanities departments, says Gregory Petsko.


As we enter the season of goodwill, let us spare a thought for
our colleagues on the other side of the two-culture divide.

A number of US universities have recently drastically cut or
closed their programmes in arts and the humanities. Departments
of classics, French, Russian, German, American studies, theatre arts,
philosophy, Italian and European literature have all suffered. To
borrow a phrase from Marx (Karl, not Groucho), a spectre is haunting
higher education: the spectre of the market.

Similar stories of cutbacks in non-science subjects have emerged
from France, Canada, Australia, New Zealand and other countries long
known for the strength of their higher-education systems. In the United
Kingdom, there is deep concern that the humanities are at serious risk in
the new education budget announced in October by chancellor George
Osborne. Excluding research support, which,
Osborne said, will remain flat “to ensure the UK
remains a world leader in science and research”,
the amount of money going to higher education
in England will probably decline by 40% over the
next four years. The government has said only that
it will continue to pay for teaching in science,
technology, engineering and mathematics.

If arts and humanities are to survive, we who
work in the sciences need to stand up for them
and alongside them. Why? We should proclaim
not only our love for the humanities as educated
people, but their crucial role in our lives as
professional scientists. I learned to think critically,
analyse deeply and write clearly in my university
humanities courses, not in my science courses. I
found humanities the most valuable subjects in
school. They still broaden my thinking, help me
to make connections and aid my ability to communicate.

The humanities are the victim of two pernicious trends that have
crept into the management of universities in the past decade or two,
based on the idea that market forces should control what happens in
education, as they are supposed to influence the economy.

The first is that higher education is increasingly run as a business;
anything that doesn't contribute positively to the bottom line of the
balance sheet is reduced or eliminated. Helping to drive this trend are
the disturbingly large number of institutions of higher learning that
are headed by administrators recruited from the worlds of business or
politics. Nothing could so undermine the mission of a university as the
misguided principle that all parts of it must make a profit. Contrary to
the prejudices of a number of administrators, there is evidence from
recent studies, including one from the University
of California, Los Angeles, that arts and humanities
departments can actually make a profit. But
I don't think we should use that line: it’s fighting
on our opponent's ground. And it is also not
clear that all science and engineering programmes make money. A
better argument is that profit and loss should not be the chief basis for
important academic decisions.

The second damaging trend is the growing mantra of student choice, 
which increasingly dictates what programmes are offered, expanded 
and supported. The thinking here is that students are consumers, and 
market forces will lead to efficiencies in education, just as they do in, say, 
finance. If the past two years have taught us anything, it’s that markets 
aren't always efficient. In fact they can be manipulated, driven by emotional
frenzy and subject to fads. Besides, there are things that simply
shouldn't be left to the brutality of the invisible hand. Education is one.

Moreover, the idea that student choice is a good thing is wrong, 
whether one believes in markets or not. Students have neither the 
wisdom nor the experience to know what they 
need to know. Left to themselves, they frequently 
choose subjects based on the fashion of the 
moment (which in the United States is currently 
economics, although at one time it was sociol- 
ogy) or on what they think will equip them best 
for a job. That the best and most valuable educa- 
tion combines breadth with depth is something 
that most students do not yet understand. We 
need less student choice, not more. We need 
more prescribed curricula, not less. 

To reverse these trends, here are some spe- 
cific suggestions for things we might do. First, 
we should affirm the principle that universities 
aren't just about discovering new knowledge or 
generating intellectual property; they are also 
supposed to preserve ideas and information that 
may seem out of date now, but that are bound to 
become important in the future, as ‘old’ ideas always do. 

Second, we have to fight the hegemony of the bean counters. Uni- 
versities should be run by people who understand what universities are 
really about. Marx (still Karl) also remarked that, for the bureaucrat, 
the world is a mere object for him to manipulate. Bureaucrats see uni- 
versities the same way. And third, we must leave the comfortable ivory 
towers of our laboratories and take a stand with those in higher educa- 
tion — be they faculty members or administrators — who oppose the 
tyranny of the market. There is only one market that has any place in 
higher education: the marketplace of ideas. 

To borrow from Marx again (Groucho this time, not Karl), those 
who run universities have had some perfectly wonderful ideas, but to 
savage arts and humanities education is not one of them. If you feel 
the same way, please speak out.


Gregory Petsko is professor of biochemistry and chemistry at 
Brandeis University in Waltham, Massachusetts. 
e-mail: petsko@brandeis.edu 
ECOLOGY
Hotter climate, 
altered breeding 


As Earth warms, amphibians 
are shifting their breeding 
times at unprecedented rates. 
Four out of ten amphibian 
species studied at a South 
Carolina wetland either 
delayed or advanced their 
breeding — depending on 
their breeding season — by 
15.3-76.4 days over a 30-year 
period. For two of the species, 
Ambystoma opacum and 
Eurycea quadridigitata, 
this coincided with a 1.2°C 
increase in overnight air 
temperature during their pre- 
breeding and breeding periods. 
Brian Todd at the University 
of California, Davis, and 
his team say that the altered 
breeding times, which 
range from 5.9 to 37.2 days 
per decade, are among the 
greatest rates of change seen 
in ecological life-cycle events. 
The changes could affect 
the dynamics of the larger 
amphibian larval community, 
including resource availability 
and predation rates. 
Proc. R. Soc. B doi:10.1098/ 
rspb.2010.1768 (2010) 


Magnetic gel
delivers drugs


Drugs and cells can be
delivered on demand by a
porous material engineered
to compress in response to an
applied magnetic field.
David Mooney at Harvard
University in Cambridge,
Massachusetts, and his team
prepared an alginate-based
gel with micrometre-sized
pores, and paramagnetic iron
nanoparticles embedded
throughout. On exposure
to a magnetic field, the
nanoparticles put the squeeze
on the ferrogel. The authors
used this to release a drug
payload in in vitro experiments
and, by implanting the gel into
mice, for localized release of
dye-stained stem cells.

With a reversible volume
reduction of more than 70%
(pictured), such ferrogels
may also find applications
as actuators and sensors in
biomedical applications.
Proc. Natl Acad. Sci. USA
doi:10.1073/pnas.1007862108
(2010)


Selections from the
scientific literature


ZOOLOGY


Snail shells spread light around


A marine snail has a shell that is remarkably
well adapted for diffusing the light that it emits
to ward off predators.

Stimulating Hinea brasiliana snails (pictured
left), by tapping them or placing them in contact
with potential predators, causes them to emit a
blue-green light from defined areas of their body,
report Dimitri Deheyn and Nerida Wilson of
the Scripps Institution of Oceanography in San
Diego, California. The shell directly transmits
most wavelengths of light, with the exception of
blue-green ones. These are instead spread by the
shell from the limited production regions of the
snail's body over a much larger area (right).

The shell produced brighter and larger areas
of diffused light than a commercial diffuser.
Such shells allow snails to produce visible and
extensive bioluminescent signals from their
protected position inside the shell.

Proc. R. Soc. B doi:10.1098/rspb.2010.2203 (2010)


DNA from across
the ocean


A handful of Icelanders may
be descendants of a Native
American woman ferried to
the island hundreds of years
before Christopher Columbus
reached the New World.

In a tiny proportion of the
country’s residents, DNA
sequences from cell organelles
called mitochondria (mtDNA)
resemble those of some Native
Americans. Unlike nuclear
DNA, mtDNA is inherited
only from the mother.

Sigridur Sunna
Ebenesersdottir at deCODE
Genetics in Reykjavik and her
colleagues traced the sequence
variants back to four Icelanders
born in the early 1700s.
However, genetic differences
between them suggest that the
mtDNA derived from a woman
who arrived in Iceland much
earlier, possibly around
the time the Vikings started
exploring the Americas in
about AD 1000. Because Native
American populations were
decimated after the arrival of
the Europeans, the lineage may
be missing from contemporary
populations. DNA analysis of
the remains of ancient Native
Americans could provide a
more definitive link.

Am. J. Phys. Anthropol. 144,
92-99 (2011)


Imaging grooves 
from glaciers 


Developments in radar 
technology have allowed 
geoscientists to ‘see through’
a Greenland glacier and
construct three-dimensional 
topographic maps of its bed. 

Kenneth Jezek of Ohio 
State University in Columbus 
and his team used high- 
resolution radar tomography 
and synthetic-aperture radar 
data to measure ice thickness 
in a region of the Jakobshavn 
Glacier. They found that as the 
glacier slides over its bed, it 
cuts large-scale ridge-groove 
features into the bedrock that 
are similar to landforms found 
on deglaciated terrain. The 
orientation and dimensions 
of the grooves suggest that 
the glacier has been flowing 
persistently in the same 
direction. 

Understanding past glacier 
movement and bedrock 
geomorphology helps 
researchers to forecast climate- 
driven changes in the seaward 
flux of ice sheets. 

Geophys. Res. Lett. doi:10.1029/ 
2010GL045519 (2010) 


Tumours aided by 
immune cells 


Zebrafish cells with the 
propensity to give rise to 
tumours behave similarly to 
wounded tissue, and call for 
assistance from the immune 
system. So say Paul Martin 
at the University of Bristol, 
UK, and his colleagues, who 
imaged the interactions 


between the cells in real time. 

The authors expressed a 
cancer-associated mutant form 
of the Ras protein in zebrafish 
(Danio rerio). Because 
zebrafish larvae are translucent, 
the team was able to visualize 
fluorescently labelled immune 
cells as they responded to the 
transformed cells. 

Cells expressing mutant Ras, 
and their healthy neighbours, 
released hydrogen peroxide, 
attracting immune cells called 
neutrophils and macrophages, 
which tethered themselves to 
the transformed cells. Blocking 
hydrogen peroxide synthesis 
— and so the recruitment of 
the immune cells — slowed the 
proliferation of transformed 
cells, suggesting that early 
immune responses may 
support tumour development. 
PLoS Biol. 8, e1000562 (2010)


VISION SCIENCE
Man or woman? 
Depends on view 


Whether a face looks like that 
of a man or a woman depends
on the part of the retina on 
which the image lands. 


Eleven volunteers were asked
to identify the gender of a series
of faces (pictured) presented in
one of eight possible visual-
field locations relative to a
central point. Arash Afraz at
the Massachusetts Institute of 
Technology in Cambridge and 
his co-workers found that two 
identical faces were perceived 
to be of different gender if they 
were presented simultaneously 
in specific, different locations. 
Volunteers’ responses became 
more consistent across the 
visual field as the images grew 
in size. 

The researchers think that
the perceptual variation may
result from the small size of
the stimuli relative to that of
the receptive field. The small
number of brain cells analysing
the images at any given location
may have varying responses;
these are averaged out by a
larger image, which stimulates
a greater number of cells.

Curr. Biol. 20, 2112-2116 (2010)


COMMUNITY CHOICE


Better memory with less microRNA


HIGHLY READ
on jneurosci.org
in November


Learning and memory in mice seem to
be enhanced by the loss of small RNA
molecules called microRNAs (miRNAs) in
the brain.

Witold Konopka at the German Cancer Research Center in
Heidelberg and his colleagues deactivated the gene for Dicer,
a key enzyme in miRNA synthesis, in forebrain neurons of
adult mice. Twelve weeks later, the mice showed improved
learning and memory in a behavioural test. This was mirrored
by increased numbers of a type of dendritic spine in mutant
neurons that is associated with learning. After 20 weeks,
however, some of the neurons had degenerated, confirming
the importance of microRNAs for neuronal survival.

J. Neurosci. 30, 14835-14842 (2010)


DEVELOPMENTAL BIOLOGY 


Immune system 
emerges in layers 


The human immune system 
develops in waves, the first 

of which begins even before 
birth. Fetal and adult T cells 
originate from different stem- 
cell populations, allowing the 
fetal immune system to better 
tolerate foreign antigens — 
namely the mother’s. 

Joseph McCune at the 
University of California, San 
Francisco, and his colleagues 
compared human fetal blood 
stem cells and T cells with those 
of adults. After implantation in 
mice that permit human blood- 
cell maturation, fetal stem cells 
were more likely than adult 
ones to develop into regulatory 
T cells. These suppress immune 
activity, enhancing tolerance to 
antigens. 

Fetal stem cells and T cells 
also had different gene- 
expression profiles from the 
adult versions of these cells. 
Statistical analysis revealed that 
developmental stage accounted 
for most of these differences. 
Science 330, 1695-1699 (2010) 


Neanderthal 
family tree 


Neanderthals living 49,000 
years ago may have abided in 
small clans banded together by 
their male kin. 

Carles Lalueza-Fox at 
Pompeu Fabra University in 
Barcelona, Spain, Antonio 
Rosas at the National Museum 
of Natural Sciences in Madrid 
and their colleagues analysed 
the remains of 12 Neanderthals. 
They sequenced certain regions 
of the mitochondrial DNA 
extracted from fragments of 
bones and teeth. 

The results showed that the 
group’s three adult males were 
close relatives, but the three 
adult females were not. The 
authors further inferred that 
an infant and two juveniles 
were offspring of two of the 
adult females. The team 
suggests that the individuals 
represent a social unit based 
on patrilocality, in which 
individuals live with the adult 
male’s family. 

Proc. Natl Acad. Sci. USA 
doi:10.1073/pnas.1011553108 
(2010) 
SEVEN DAYS


US policy flurry 


In a fit of pre-Christmas
legislating, the US Senate 
reauthorized a version of the 
America COMPETES Act, 
which would keep on track a 
series of budget increases for 
key science funding agencies, 
including the National Science 
Foundation. If this is signed 
into law, money for the 
increases would need to be 
found in the 2011 budget. Last 
week, the Senate also passed 
food-safety legislation, which 
would give the Food and 
Drug Administration broad 
new food-policing powers. 
Both bills were expected to 
be passed by the House of 
Representatives as Nature 
went to press. Another bill, 
passed by both houses, calls 
for an integrated national 
plan to overcome Alzheimer’s 
disease. 


Haiti cholera fight 
With the death toll from Haiti’s 
cholera epidemic passing 2,400, 
an expert meeting convened 

by the Pan American Health 
Organization on 17 December 
called for the use of cholera 
vaccines in the country, at least 
as a pilot project. It also urged 
the creation of an international 
stockpile of cholera vaccine — 
only about 100,000 doses 

are currently available for 
shipment. Separately, the 
United Nations secretary- 
general Ban Ki-moon 
announced an independent 
investigation into the source of 
the outbreak. 


Synthetic biology
US research in synthetic
biology should be overseen
at White House level, but
not over-regulated, said
a presidential bioethics
commission in a report
published on 16 December.
Claiming to navigate a middle
road between unbridled
experimentation and a
regulatory straitjacket, the
commission said that the field
should embrace “an ongoing
process of prudent vigilance”,
and did not call for new laws
or changes to regulations. See
go.nature.com/thidag for more.


Research sub set for rebirth


The venerable Alvin submersible, which has
enabled numerous historic discoveries since it
was first launched in 1964, is undergoing a
US$40-million transformation. During its long
life, Alvin has been involved in the discovery of
deep-sea hydrothermal vents; taken humans
to the wreckage of the Titanic for the first
time; and helped to recover a lost hydrogen
bomb. Owned by the US Navy and operated
by the Woods Hole Oceanographic Institution
in Massachusetts, Alvin had its final dive
(pictured) in its current form on 14 December;
it will now be upgraded to have a larger crew
compartment, manipulator arms and an
advanced autopilot. See go.nature.com/5adtko


ITER squeezed 

The European Parliament 
has rejected a plan to close 

a funding gap in the budget 
of ITER, the €15-billion 
(US$19.7-billion) fusion 
reactor under construction 
near Cadarache, France. To 
cover a €1.4-billion shortfall 
in 2012-13, the European 
Commission had proposed 
using money from elsewhere 
in the European Union's 
budget, including research 
funds (see Nature 466, 171; 
2010). But on 15 December 
the parliament turned that 
down. Budget negotiations 
will now continue into 2011. 


Science of security 
The US Department of 
Defense should sponsor 
university research 
programmes in cybersecurity, 
according to a report by the 
JASON group, which advises 
the US government on defence 
science and technology. The 
November report — released 
by the Federation of American 
Scientists on 14 December 

— says cybersecurity as a 
discipline should be thought 
of as an applied science akin to 
medicine, which would benefit 
from rigorous experiments. 


UN biodiversity 

At its general assembly on 

20 December, the United 
Nations gave the final 
go-ahead for a body that 

will monitor global ecology, 
the Intergovernmental 
Science-Policy Platform on 
Biodiversity and Ecosystem 
Services (IPBES). It will 
operate much like the 
Intergovernmental Panel on 
Climate Change, conducting 
periodic assessments of Earth’s 
biodiversity and ecosystems 
services. 


Periodic-table shift 


Natural geographic variations 
in the abundance of a chemical 
element's isotopes should be 
noted on the periodic table, 
chemistry’s governing body, 




the International Union of 
Pure and Applied Chemistry, 
has decided. The decision 
means that ten common 
elements — including 
hydrogen, carbon and 
oxygen — will be assigned 
a range, rather than a single 
average number, for their 
atomic weight. Hydrogen 
will be [1.00784; 1.00811], 
for example, rather than 
[1.00794]. 


Integrity guidelines 


Long-awaited guidelines 

on scientific integrity in 
government were released on 
17 December by the White 
House Office of Science and 
Technology. See page 1009 
for more. 


Zerhouni move
Elias Zerhouni (pictured),
who directed the US National
Institutes of Health (NIH)
from 2002 to 2008, will head
research and development at
French pharmaceutical giant
Sanofi-aventis. The company,
headquartered in Paris,
announced the appointment
on 16 December.


BUSINESS WATCH
Public settlements paid by
pharmaceutical companies to US
governments (both state and
federal) for illegal behaviour
climbed to US$14.8 billion over
the past five years, according to a
study released on 16 December by
Public Citizen, a non-profit group
in Washington DC. Illegal off-label
promotion was responsible
for the largest amount of federal
penalties, and settlements under
the False Claims Act, for
activities such as inflating drug
prices, now exceed those made
by the defence industry.


BUSINESS
Drug tug-of-war 


Swiss pharmaceutical 
company Roche intends to 
challenge the US Food and 
Drug Administration (FDA) 
over its 16 December decision 
to withdraw approval of the 
drug Avastin (bevacizumab) 
for the treatment of advanced 
breast cancer. The FDA’s 
announcement came five 
months after a panel of 
advisers decided that the drug's 
benefits did not outweigh
its risks in patients with
breast cancer. See go.nature.com/7milea
for more.


Carbon trading 


California’s Air Resources 
Board has approved 
regulations to create the 
United States’ largest market 
in carbon trading. From 

2012, the scheme will cap 
greenhouse-gas emissions 
from the state's electric utilities 
and heavy industry, allowing 
companies to trade emissions 
permits to reach their targets. 
Transportation fuels will be 
included by 2015. The system 
may later allow firms to reduce 
emissions elsewhere — such 
as by protecting forests in 
Brazil — in order to meet state 




requirements. California is 
pushing to cut emissions to 
1990 levels (15% below today’s 
levels) by 2020. 


Conflicts of interest 


US medical schools are quickly 
improving policies on conflicts 
of interest between faculty and 
pharmaceutical companies — 
such as restrictions on gifts 
and consulting relationships. 
According to a scoreboard 
released on 15 December 

by the American Medical 
Student Association in 

Reston, Virginia (see www.amsascorecard.org),
52% of schools scored ‘A’ or ‘B’ for
their policies, up from 30% in 
2009 and 14% in 2008. 


IceCube telescope 
Researchers at the South Pole 
have completed construction 
of a giant neutrino telescope 
that consists of an array of 
wires and detectors set deep 

in Antarctic ice. The IceCube 
Neutrino Observatory has been 
under construction since 2005; 
86 wires, at depths of between 
1,450 and 2,450 metres, each 
have 60 basketball-sized 
detectors that look for cosmic 
neutrinos hitting oxygen atoms 
in the water molecules of the 
ice. The final detector string 
was laid on 18 December; the 
full array can start taking data 
in May. See go.nature.com/kqxlen
for more.


US PHARMA FINES
The penalties paid out by US pharmaceutical companies have
increased dramatically in recent years.
[Chart: financial penalties (US$ billions) by year; axis labelled 1999 to 2009.]
One settlement with GlaxoSmithKline for $3.4 billion accounts for the spike in financial
penalties in 2006. 2010 data include only the first 10 months of the calendar year to
1 November 2010.




1 JANUARY 

Hungary assumes a six- 
month presidency of the 
European Union. 


3-7 JANUARY 

The Society for 
Integrative and 
Comparative Biology 
meets in Salt Lake City, 
Utah. 
www.sicb.org/meetings/2011


Carbon storage 

The state of Queensland, 
Australia, has said it will not 
fund a proposed flagship 
demonstration project to 
capture carbon dioxide
(CO2) emissions from a
coal-fired power station and 
store them underground. 

The ZeroGen project — 

on which the state had 

already spent A$192 million 
(US$191 million) — was 
intended to be a A$4.3-billion 
coal-gasification plant with a 
530-megawatt capacity, storing 
about 2 million tonnes of CO2
per year, and in operation by 
2015. But Queensland premier 
Anna Bligh said early research 
had shown that the idea was 
“not viable at this time on a 
commercial scale”. ZeroGen 
will now go it alone, becoming 
an entity owned and run by 
industry, Bligh said. 


UK medical hub 

The UK Centre for Medical 
Research and Innovation 
has received the go-ahead 

to begin construction. On 
16 December, the London 
borough of Camden, which 
will host the £500-million
(US$778-million) facility, 
approved the project, which 
is being funded jointly by 
the government, Cancer 
Research UK, the Wellcome 
Trust and University College 
London. Construction 
should begin next spring and 
finish by 2015. 




Presidential adviser John Holdren drew up guidelines at Barack Obama’s request. 


SCIENCE & POLITICS 


Integrity policy 
unveiled at last 


Mixed reviews greet White House guidelines for preventing 
political interference in US government science. 


BY EUGENIE SAMUEL REICH 


Four pages in 648 days. At that rate it
would have taken Leo Tolstoy centuries
to write War and Peace. But to get to this
point, John Holdren, director of the White
House Office of Science and Technology Policy,
may have battled through the bureaucratic
equivalent of the Napoleonic Wars.


On 17 December, Holdren finally released 
a long-promised set of guidelines for scien- 
tific integrity in US government departments 
and agencies. On the White House website 
Holdren wrote that the document includes 
“a clear prohibition on political interference in 
scientific processes and expanded assurances 
of transparency”. He also wrote that depart- 
ment and agency heads would have 120 days to 


demonstrate progress towards implementing 
the new rules. 

Watchdog groups who campaign for sound 
science in government decision-making gave 
the guidelines a cautious reception. “We will 
just have to wait and see what the agencies do
with it,” says Francesca Grifo of the Union of
Concerned Scientists (UCS), headquartered
in Cambridge, Massachusetts. “The jury is
still out.”

The document is the product of an initiative
that began soon after President Barack Obama 
took office. In March 2009, Obama issued a 
memorandum on scientific integrity that for- 
bade the distortion of science for political ends. 
The move seemed to signal a clear departure 
from practices adopted during the administra- 
tion of President George W. Bush, which faced 
accusations of weakening the role of science in 
regulatory agencies and of muzzling scientists 
whose views were at odds with those of the 
White House. 

But the road to implementing Obama’s 
vision has been tortuous. The guidelines, 
expected in July 2009, became mired in 
unwieldy discussion as Holdren struggled to 
get all relevant departments and agencies to 
accept a common set of principles. The US 
Department of the Interior issued a draft 
policy earlier this year, only to backtrack after 
advocacy groups slammed it as incomplete 
and ambiguous. Following the oil spill in the 
Gulf of Mexico in summer 2010, the Obama 
administration was itself accused of suppressing
scientific information to put a better gloss
on the situation.

The White House document lays out goals 
for science in government but says little about 
how they should be achieved. It directs agencies
to “ensure that the data and research used
to support policy decisions undergo independent
peer review”, to adopt protection for
whistleblowers and to “facilitate the free flow of
scientific and technological information”.

Roger Pielke of the University of Colorado, 
Boulder, whose research focuses on the inter- 
section of public policy with science, questions 
why it has taken so long to issue such a limited 
document. “It sets forth discussion questions 
about scientific integrity in government, but 
I don't think it resolves them,” he says. Pielke
says that given how long it took to create the 
document, there may not be time for much 
progress before the end of Obama’s term of 
office in 2012. He adds that even if it had 
been issued earlier it would not have prevented
the issues around scientific integrity
that arose during the oil spill.

Some advocates agree that the docu- 
ment is a disappointment. “It was a very 
long wait for four pages,” says Jeff Ruch 
of Public Employees for Environmental 
Responsibility (PEER), based in Wash- 
ington DC, which has represented several 
scientist whistleblowers. “We feel frustrated 
that this process is horribly off schedule.” 
Ruch says that several sentences have the
potential to make things worse, rather than
better, for government scientists. For example,
the guidelines say that researchers can
speak to the media, provided there has
been “appropriate coordination” with
public-affairs offices, but they fail to define
what is appropriate. They also allow scientists
to speak publicly about their “official
work” but fail to offer
protection for scientists who are judged to 
have spoken up in their private capacity. 
“Scientists are free to speak, except when 
they're not,” says Ruch. 

Grifo says that her organization is a lit- 
tle more positive than PEER. She points 
to sections that unambiguously allow gov- 
ernment scientists to serve on the boards of 
scientific societies and journals, to present 
findings at scientific conferences and to 
accept awards and honours for the science 
they do. These are major issues, she adds, 
because the UCS has heard from govern- 
ment scientists who have been prevented 
from doing these things in the past because 
of a perceived conflict of interest. 

But she agrees with Ruch that the media 
policy lacks specificity, and also argues that 
the guidelines should have taken a stronger 
position against scientists with financial 
conflicts of interest serving as advisers to 
the government. 

James Hansen, head of the NASA Goddard 
Institute for Space Studies in New York City, 
who became well known for speaking out 
publicly about censorship of his scientific 
work by NASA press offices during the 
Bush administration, says that the new policy
does not change either of what he sees
as two central problems: the use of political
appointees to run public-affairs offices,
and the requirement that the White House
screen testimonies that scientists make to
Congress. “A democracy cannot function
well with the present approach,” he says.

For more on government interference see go.nature.com/p3g9hy
UK science faces
facilities freeze 


Four-year budget protects grants but cuts capital spending. 


BY GEOFF BRUMFIEL & NATASHA GILBERT 


British scientists hoping for shiny new
facilities this Christmas will be disappointed
by their government's research-funding
plans.

On 20 December, the Department of Busi- 
ness Innovation and Skills, which oversees 
research and higher-education funding, 
unveiled a four-year budget which makes deep 
cuts to cash for large projects such as particle 
accelerators, research ships and university lab 
space (see ‘Capital crunch’). Meanwhile, two 
of the councils that support specific areas of 
research announced that they will put a new 
emphasis on the economic impact and social 
benefit of the work they fund. The net effect 
will be a squeeze on money for new projects 
and blue-skies research in the coming years. 

By cutting the £873-million (US$1.3-billion) 
annual capital budget by roughly 40%, the gov- 
ernment says it can maintain grant funding at 
the current level. Yet several key facilities will 
be shielded from the capital cut, including the 
UK Centre for Medical Research and Innova- 
tion, a new £500-million biomedical laboratory 
in central London. The budget also protects a 
handful of other planned facilities, and inter- 
national subscriptions to organizations such 
as CERN, the European high-energy physics 
laboratory located near Geneva, Switzerland. 

CAPITAL CRUNCH
UK government funding for research has been
protected, but at the expense of cash for buildings
and major projects.

But some research councils will struggle to
cope with the cuts. The Natural Environment
Research Council (NERC) said that it remained 
committed to a handful of key projects, includ- 
ing a replacement for its research vessel Dis- 
covery. But no new projects are likely to start 
in the next four years, according to Marion 
O’Sullivan, a NERC spokeswoman. Similarly, 
the Medical Research Council says the capital 
reductions will pose “challenges”, according 
to a statement from John Jeans, the council’s 
deputy chief executive. 

The UK government's efforts to squeeze as
much value as possible from its research spending
have also led two of the research councils to
announce changes to their missions. The Biotechnology
and Biological Sciences Research
Council (BBSRC) no longer sees itself as a science
‘funder’, but rather as an investor of public
funding in science. Matt Goode, a spokesman
for the BBSRC, says this refocus is a “subtle 
semantic change” and that the council is not 
abandoning basic research. Meanwhile, the 
Engineering and Physical Sciences Research 
Council (EPSRC) announced that it would 
become a “sponsor” of research. “Funding 
is viewed as a strategic investment and not a 
transfer of funds without obligations,” David 
Delpy, the EPSRC’s chief, said in a video mes- 
sage explaining the shift. Researchers would 
be asked to think about impact at every stage 
of the research process, Delpy said. 

“Obviously this is sheer lunacy,” says Paul 
Clarke, a chemist at the University of York, 
UK. “If I knew what the impact of the research
would be, I wouldn't have to do the research.”

Research funds for English universities will 
also be squeezed. The Higher Education Fund- 
ing Council for England (HEFCE) will have its 
annual £1.6 billion for research grants cut by 
about 3% over the next four years (universi- 
ties elsewhere in Britain are overseen by other 
bodies). But as with the research councils, the biggest
cuts hit the capital budget, which will be
slashed by 40% from its present level of £167 
million over the same period. The HEFCE 
will announce how it will slice up its budget 
between universities in March 2011. 

Imran Khan, director of the Campaign for 
Science & Engineering in the UK, a London- 
based advocacy group, fears that some research 
councils may be forced to dip into money 
intended for basic research to make up for the 
capital shortfall. “The money will have to come 
from somewhere,” he says.
the proportion of publications in each building where first and last authors work (grey is low, blue is high). Statistically, bluer buildings are also higher.
Love thy lab neighbour


Getting closer to your collaborators boosts a paper’s citations. 


BY RICHARD VAN NOORDEN 


Anyone who has worked in a laboratory
feels that having key members
of the group placed closer together
makes for a better research project. A study linking
the proximity of investigators and the impact
of their research now backs up that hunch.
Isaac Kohane, co-director of the Harvard 
Medical School Center for Biomedical Infor- 
matics in Boston, Massachusetts, decided to 
put intuition to the test in 2005 after a debate 
with Harvard’s dean of administration, 
Richard Mills, over the layout of the centre. 
“I felt this viscerally, but there was no hard
evidence,” says Kohane. He enlisted more than
a dozen undergraduates to identify 35,000 
articles published between 1999 and 2003 in 
biomedical sciences, each with at least one 
Harvard author. It took the team two years to 
pinpoint where individual Harvard investiga- 
tors were working — right down to the level of 
individual offices and laboratories. 
The results, published in PLoS ONE last 
week (K. Lee et al. PLoS ONE 5, e14279; 2010),


show that the shorter the geographical distance 
between first and last authors on a paper, the 
more highly cited were their research papers. 
First authors often bear the brunt of the work, 
whereas last authors tend to take the lead 
organizational role — and both are key players 
in the research project. The distance trend was 
not found for middle authors, who could be far 
removed from other collaborators without any 
clear effect on research impact. 

Kohane and his colleagues also looked at 
individual buildings on the four campuses 
across which Harvard life-science research 
happens to be spread. They found that the 
more that researchers within a building tended 
to collaborate with one another rather than 
with people elsewhere, the more highly cited 
the publications that came from that building 
(see picture). The team does acknowledge an 
alternative explanation for the data: that scien- 
tists might choose to keep 
potentially high-impact 
breakthroughs within their 
own laboratory, or within a 
close circle of researchers. 




This seems to be the first empirical study of 
the connection between proximity and impact, 
says Anthony van Raan, an expert in using cita- 
tion analyses to study scientific productivity 
and impact at Leiden University, the Nether- 
lands. Most studies of the relationship between 
spatial separation and scientific impact have 
been done on a national and international 
scale, for which it has been demonstrated 
many times that international collaborations 
produce more highly cited science than local 
collaborations — probably a consequence of 
the size and scope of such efforts. 

Kohane speculates that international 
collaborations might become even more 
successful if the first and last authors worked 
very close together, something that has not 
yet been tested. He certainly practises what 
he preaches: he and first author Kyungjoon 
(Joon) Lee, who coordinated the undergradu- 
ates’ fact-finding, now work on the same floor. 
“When the study started we were on different 
floors,” says Kohane, “and Joon told me that I
became a lot more helpful when I moved to
his floor.”
A finger bone and a tooth (inset) from Denisova Cave have illuminated a mysterious strand of hominin.


PALAEOANTHROPOLOGY 


Fossil genome 
reveals ancestral link 


A distant cousin raises questions about human origins. 


BY EWEN CALLAWAY 


The ice-age world is starting to look
cosmopolitan. While Neanderthals held
sway in Europe and modern humans
were beginning to populate the globe, another 
ancient human relative lived in Asia, according 
to a genome sequence recovered from a finger 
bone in a cave in southern Siberia. A comparative
analysis of the genome with those of modern
humans suggests that a trace of this poorly
understood strand of hominin lineage survives 
today, but only in the genes of some Papuans 
and Pacific islanders. 

Named after the cave that yielded the 
30,000-50,000-year-old bone, the Denisova 
nuclear genome follows publication of the same 
individual’s mitochondrial genome in March¹.
From that sequence, Svante Pääbo of the Max
Planck Institute for Evolutionary Anthropology 
in Leipzig, Germany, and his colleagues could 




tell little, except that the individual, now known 
to be female, was part of a population long 
diverged from humans and Neanderthals. 

Her approximately 3-billion-letter nuclear 
genome, reported in this issue of Nature², now
provides a more telling glimpse into this mys- 
terious group. It also raises previously unimag- 
ined questions about its history and relationship 
to Neanderthals and humans. “The whole story 
is incredible. It’s like a surprising Christmas
present,” says Carles Lalueza Fox, a palaeogeneticist
at Pompeu Fabra University in Barcelona,
Spain, who was not involved in the research. 

When the ancient genome was compared to 
a spectrum of modern human populations, a 
striking relationship emerged. Unlike most 
groups, Melanesians — inhabitants of Papua 
New Guinea and islands northeast of Australia 
— seem to have inherited as much as one- 
twentieth of their DNA from Denisovan roots. 
This suggests that after the ancestors of today’s 


Papuans split from other human populations 
and migrated east, they interbred with Denis- 
ovans, but precisely when, where and to what 
extent is unclear. 

More answers could come from a closer look 
at Denisovan, human and even Neanderthal 
DNA. So far, conclusions about interbreed- 
ing have been drawn from a relatively small 
number of human genomes using conservative 
DNA-analysis methods, says David Reich, a 
geneticist at Harvard Medical School in Boston, 
Massachusetts, who led the Denisova analysis. 
“There may have been many more interactions,” 
he says. Pääbo says it may be possible to determine
roughly when humans interbred with
Denisovans by examining the length of DNA 
segments lurking in various human genomes, 
with shorter segments corresponding to more 
shuffling of genes and a longer elapsed time. 

A molar discovered in the same cave also 
yielded mitochondrial DNA resembling that of 
the finger bone. But the Denisovans were probably
more widespread, says Pääbo. Some fossils
from China, for example, resemble neither
Neanderthals nor modern humans, nor Homo
erectus, an earlier human ancestor. Pääbo wonders
whether they could be more closely related
to Denisovans. His Russian collaborators plan to 
search for more complete Denisovan fossils that 
could be matched to others from China. 

Chris Stringer, a palaeoanthropologist at 
London’s Natural History Museum, agrees 
that Asian fossils, such as the 200,000-year- 
old Dali skull from central China, could have 
links to the Denisovans. But he says that firm 
conclusions about such relationships will 
have to await the discovery of more complete 
Denisovan fossils. 

Preserved DNA from other Asian fossils 
would also provide a clearer picture of the Denisovans,
which Pääbo, to sidestep controversy,
has opted not to call a new species or subspecies
of hominin. The challenge will be to make sense
of such discoveries and put them in the context 
of ancient human history, says Lalueza Fox. 
Palaeoanthropologists are just beginning to 
scrutinize the Neanderthal genome published 
earlier this year³ for clues to ancient human
history. With the Denisova genome, “they will
need to deal with another surprise”, he says.
SEE ALSO NEWS & VIEWS P.1044 
1. Krause, J. et al. Nature 464, 894-897 (2010). 
3. Green, R. E. et al. Science 328, 710-722 (2010).
Deep lab denied funding


Divisions within US National Science Foundation throw 
plans for underground science facility into crisis. 


BY EUGENIE SAMUEL REICH 


Ambitious plans to build one of the
world's deepest underground laboratories
have suffered a serious setback.
The US National Science Board (NSB) has 
refused to continue to fund the design of the 
Deep Underground Science and Engineer- 
ing Laboratory (DUSEL), leaving some 1,000 
researchers hoping to do science there uncer- 
tain about its future. 

The lab is set to be housed in Homestake, 
a former goldmine near Lead, South Dakota. 
The mine is an ideal location for sensitive 
experiments trying to catch sight of hard-to- 
detect particles such as neutrinos and dark 
matter. At almost 2,500 metres deep, it would 
shield DUSEL from the cosmic rays that would 
otherwise drown out signals from the lab’s elu- 
sive targets. 

The US National Science Foundation (NSF) 
and its partners, including the US Department 
of Energy, have already committed more than 
$300 million towards DUSEL, which is expected 
to cost more than $800 million in total. But 
Edward Seidel, assistant director for mathematical
and physical sciences at the NSF, says that the
$29 million awarded to the University of Cali- 
fornia, Berkeley, in 2009 to design and prepare 
the mine for DUSEL has proved inadequate. 

Safety concerns arose earlier this year about 
the mine shafts that scientists will use to access 
the facility. It is also proving difficult to pump 
groundwater from the ageing mine. As funds 
could not be reallocated from other parts of 
the project, programme managers this month 
requested another $19 million now, with per- 
haps another $10 million to come in the spring 
of 2011, to continue that preparatory work. 

But the NSB, which must approve large 
outlays by the NSF, refused both requests on
2 December. Although the infrastructure for 
each of the lab’s experiments will be managed 
by an allocated lead agency, the board was concerned
by the perceived lack of a clear stewardship
plan for the mine’s infrastructure, raising
the prospect that the NSF could face balloon- 
ing costs. Board members also believe that the 
energy department should contribute more 
than its current commitment of $100 million. 

“We don’t know if this is a glitch or a death
knell. I think the users
feel like we’re in limbo
right now,” says Steven
Elliott, a neutron scientist
at Los Alamos National
Laboratory in New Mexico and chairman of the
executive committee of the DUSEL Research
Association, which represents the researchers
who expect to do science at the new lab.

NATURE.COM
For a longer version of this story, see:
go.nature.com/8p2try

The NSB’s decision also exposes internal 
differences at the NSF about the best way to 
pay for major science infrastructure projects 
within a funding system more attuned to sup- 
porting research programmes. Although the 
decision does not mean that the NSF will not 
build or steward the facility, says Seidel, “it’s 
clear that the current stewardship model will 
have to change”. The DUSEL project team is 
now talking to all of its partners to formulate a 
plan to keep preparatory work going. 

Much of that work involves designing the
experiments that will be lowered into caverns
near the surface, and 1,500 and 2,300 metres
underground. These include the Long Baseline
Neutrino Experiment — in which neutrinos will be fired
at detectors in DUSEL
from 1,000 kilometres away at Fermilab in
Batavia, Illinois, to find out why there is so 
much more matter than antimatter in our Uni- 
verse — and LUX, the world’s most sensitive 
search for dark matter. Beyond particle phys- 
ics, DUSEL is expected to include a broad suite 
of geophysical and biological experiments, as 
well as a facility for testing the effects of seques- 
tering carbon dioxide deep underground. 

Building the lab would allow the United 
States to compete effectively with other coun- 
tries that have underground facilities, such as 
the Super-Kamiokande neutrino observatory 
near Hida, Japan. “Here in the United States we 
are conspicuous in not having a deep underground
science lab, in contrast to other countries
with large science programmes,” says Rick
Gaitskell, a particle astrophysicist at Brown 
University in Providence, Rhode Island, who 
hopes to use DUSEL. 

The Homestake mine currently houses the 
Sanford Underground Laboratory, which hosts 
smaller-scale versions of experiments intended 
for DUSEL. The additional funding is needed 
in part to transition the operations of this lab 
into the planned construction of DUSEL. The 
lab has enough funding to maintain the mine 
until May 2011, says its spokesman, Bill Harlan. 
A final NSF decision on whether to go ahead 
with DUSEL was expected in 2011, but may 
now be delayed.
BY ADAM MANN


: the year in which... 


In a year marked by environmental disasters, Pakistan was perhaps hardest hit, as a flood affected an estimated 20 million people. 


Natural disasters pummelled Earth 


In January, a magnitude-7.0 earthquake struck Haiti — the most 
violent such event to strike the impoverished nation in a century. An
estimated 230,000 people died and a further 1 million were left home- 
less. Other earthquakes, including a magnitude-8.8 quake in Chile 
in February and a magnitude 7.1 in New Zealand in September, also 
caused widespread damage, but smaller death tolls. Ash from the eruption
of volcano Eyjafjallajökull in Iceland grounded commercial flights
across Europe for a week in April, stranding thousands of travellers 
(see ‘Quotes of the year’). And unusually intense rains related to the 
La Niña cooling of the Pacific Ocean flooded one-fifth of Pakistan
and affected an estimated 20 million people. The weather pattern was 
also implicated in a drought in Russia as the country experienced 
the hottest summer in its recorded history, unleashing hundreds of 
deadly wildfires. 


Ancient kissing cousins were found 


Two reports suggested that modern man carries genes from extinct
branches of the human family tree. The question of whether Neanderthals,
which went extinct about 30,000 years ago, ever mated with humans had been
hotly debated, but evidence had been sparse. Even the sequencing of the
Neanderthal genome in 2009 provided no definitive evidence. On 7 May,
researchers announced the results of a genetic analysis of nearly
2,000 people from around the world, which yielded signs of gene flow
between Neanderthals and Homo sapiens around the time that modern 
humans first migrated out of Africa some 50,000-60,000 years ago. Evi- 
dence of more recent mixing appears in this issue of Nature. A genome 
extracted from a 30,000-50,000 year-old finger bone found in a Siberian 
cave not only attests to the existence of another hominin group, but sug- 
gests that the group interbred with a particular band of human migrants 
that were ancestors of today’s Melanesians. SEE P.1044 & 1053


Doctors gained new weapons against HIV 


In July, researchers revealed that an antiretroviral microbicide gel cut HIV 
infection by up to 54% in women who used it regularly. The findings, 
which came from a study of about 900 South African women at high 
risk of infection, gave hope to those seeking to bring down the rate of 
HIV infection in sub-Saharan Africa, where the majority of new cases 
occur. Another breakthrough came in November, when a study of nearly 
2,500 men showed that the antiretroviral drug Truvada is an effective pre- 
ventative measure. Among men who have sex with men, those who took 
the drug consistently lowered their risk of acquiring the virus by 73%. 


Scientists unveiled a synthetic genome 


In a bold step towards designer life, researchers at the J. Craig Venter
Institute in Rockville, Maryland, announced on 20 May that an artificial 
genome inserted into a bacterium had successfully commandeered the 
cell and commenced replication. Using the genome from the bacterium 
Mycoplasma mycoides as a blueprint, Daniel Gibson and his colleagues
at the institute assembled their synthetic genome in a yeast cell and 
transplanted it into the closely related species Mycoplasma capricolum. 
Although the 1.1-million-base-pair sequence was a near duplicate of 
M. mycoides, it included four special ‘watermark’ sequences to distinguish
it from the original, as well as a hidden code that, once deciphered,
included a website address and several famous quotes. Some research- 
ers considered the move to be a significant advance over conventional 
genetic engineering, although others argued that scientists are a long 
way from being able to design and construct novel bacteria from scratch. 
If the technology advances sufficiently, many hope that artificial life can 
be used for a variety of tasks, including carbon sequestration, biofuel 
production or the clean-up of chemicals. 


Oil gushed into the Gulf of Mexico 


On 20 April, an explosion on BP’s Deepwater Horizon oil rig killed 
11 workers and precipitated one of the worst oil spills in history. By 
August, the damaged well had dumped nearly 5 million barrels of oil into 
the Gulf of Mexico, spewing as much as 62,000 barrels a day at its peak. 
Engineers capped the well on 15 July, although it was not permanently 
sealed with cement until 19 September. During the spill, researchers 
detected large plumes of oil below the water’s surface. In its aftermath, 
debate continued over where all the oil had gone. An estimate released by 
the US National Oceanic and Atmospheric Administration, which sug- 
gested that about half of the oil had dispersed, dissolved or evaporated, 
was roundly criticized as too optimistic. Later, researchers discovered a 
layer of precipitated oil on the sea floor. SEE P.1024


Climate-change policy stalled 


Efforts to confront climate change stumbled early on, but finished the 
year on a positive note. In January, the Intergovernmental Panel on Cli- 
mate Change, chaired by Rajendra Pachauri, was embarrassed to learn 
that a 2007 report had erred when it stated that all glaciers in the central 
and eastern Himalayas could melt by 2035. The claim had not come 
from peer-reviewed scientific literature, but from a comment by Indian 
glaciologist Syed Hasnain in a 1999 article in New Scientist, and the 
mistake provided fodder for climate-change sceptics. Over the summer, 
three US senators — John Kerry (Democrat, Massachusetts), Joseph 
Lieberman (Independent, Connecticut) and Lindsey Graham (Repub- 
lican, South Carolina) — failed to push through a bill that would have 
instituted a cap-and-trade system for domestic industry's carbon emis- 
sions, even though the House of Representatives had approved a similar 
bill. The end-of-the-year United Nations Framework Convention on
Climate Change meeting in Cancun, Mexico, brought some good
news when participants agreed to set a goal of limiting average warming
to 2°C above preindustrial levels. Building on the ultimately unsuccessful
Copenhagen Accord from last year, countries also created an international
tracking system to report progress on lowering emissions.


Rajendra Pachauri felt the heat in 2010.


Quotes of the year


“The first self-replicating species we’ve had
on the planet whose parent is a computer.”


Craig Venter describes the artificial bacterium created at his lab.
Source: New York Times


“If we can turn the oil into smoke,
we’ll all be happy.”


Ed Levine, National Oceanic and Atmospheric Administration
scientific support coordinator for the Deepwater Horizon spill effort,
on burning the oil in open water before it reaches land.
Source: Nature


“When I got the telephone call, I thought,
‘Oh shit!’ The second thought that came
to my mind: ‘Oh dear, I will not win many
more prizes.’”


Andre Geim describes how he felt after learning he would share
this year’s Nobel Prize in Physics.
Source: Nature’s The Great Beyond blog


“My own personal feeling is that the
chances of life on this planet are 100%.”


Astronomer Steven Vogt on his team’s discovery of an extrasolar
planet orbiting in the ‘habitable zone’ of the star Gliese 581. Other
scientists have been unable to find evidence for the planet.
Source: Daily Telegraph


“We checked every option, but there were
no boats and no train tickets available. That’s
when my fabulous assistant determined the
easiest thing would be to take a taxi.”


Comedian John Cleese, who made the journey from Oslo to
Brussels following the eruption of Eyjafjallajökull in said taxi, at a
cost of 30,000 Norwegian kroner (US$5,000).
Source: Sky News Online


“There might be some interesting
application, but frankly I don’t have one now.”


Physicist Andrew Cleland speaks about placing a 30-micrometre
mechanical paddle into a superposition of quantum mechanical
states so that it was simultaneously vibrating and not vibrating.
Source: Nature


“They have managed to reach an agreement
by moving the goalposts closer to the ball.”


David Victor, director of the Laboratory on International Law and
Regulation at the University of California, San Diego, discusses the
international climate agreement reached in Cancun.
Source: Nature


Japan’s space agency had a hit and a miss 


On 16 November, researchers confirmed that micrometre-sized grains 
found in the Japan Aerospace Exploration Agency’s Hayabusa space- 
craft were authentic asteroid dust. The mission, which gently kissed the 
surface of the Itokawa asteroid twice in November 2005, is the first to 
retrieve asteroidal material and return it to Earth for study. A month 
later the agency experienced a setback when its Akatsuki spacecraft 
failed to enter orbit around Venus, instead sailing past the planet into 
interplanetary space. Akatsuki, which would have mapped Venus using 
infrared cameras that can peer beneath its dense cloud layer and search 
for evidence of recent volcanic activity, will have to orbit the Sun and 
wait six more years for another try. 


Stem-cell research rode a roller coaster 


US scientists were jolted on 23 August when federal district court judge 
Royce Lamberth placed an injunction on federally funded human 
embryonic stem-cell research. The move also overrode the March 2009 
executive order of US President Barack Obama mandating the National 
Institutes of Health to develop a policy for the approval of new stem-cell 
lines, which had been prohibited under the administration of George 
W. Bush. The injunction was to remain in force until Judge Lamberth 
decided whether the research violates the Dickey-Wicker Amendment, 
which prohibits the destruction of human embryos in research. But on
9 September, the US Court of Appeals for the District of Columbia Cir- 
cuit issued a stay on the injunction, allowing federal funding to continue 
until the court rules on whether Lamberth’s injunction should stand. 
Some federal stem-cell research resumed, but scientists are braced for 
more setbacks. Unless federal law is changed, many say the argument 
will ultimately find its way to the Supreme Court. 


Astronomers joined the dark side 


In August, US astronomers released the Astro2010 Decadal Survey, a 
highly influential document that, once every ten years, recommends 
which astronomy and astrophysics projects NASA, the National Science 
Foundation and the Department of Energy should fund. Acknowledg- 
ing the prospect of budget cuts during the economic downturn, the 
report recommended a few large, expensive projects — such as the 
US$1.6-billion Wide Field Infrared Survey Telescope (WFIRST), a 
1.5-metre space-based instrument that could investigate dark energy, 
the mysterious phenomenon that is causing the expansion of the Uni- 
verse to accelerate. But November brought unwelcome news: a report 
commissioned by Senator Barbara Mikulski (Democrat, Maryland) 
concluded that the 6.5-metre James Webb Space Telescope, successor to 
the Hubble, would come in at least $1.5 billion over budget and would 
be delayed for more than a year. This implicit expected drain on NASA's 
budget leaves funding for WFIRST an open question. 


A Japanese capsule bearing asteroidal dust is recovered in Australia.


The budget crunch hit European science


Austerity measures across many European countries stricken by the
financial crisis took a toll on scientists. The five member states contributing
to CERN, Europe’s particle-physics laboratory near Geneva,
Switzerland, approved a plan in September to reduce contributions by
about $140 million over the next five years and to slow down the pace
of smaller research projects to protect the lab’s flagship Large Hadron 
Collider, the world’s largest particle accelerator. And Italy and Britain 
said they will temporarily reduce their contributions to the European 
Synchrotron Radiation Facility in Grenoble, France. Other countries, 
looking to slash their budgets, announced freezes or reductions in sci- 
entific investments; for example, the Spanish government's expenditure 
in research and development will drop by 8.37% next year. In October, 
UK scientists fought back against the funding cuts, rallying to protest 
in London. Eventually, the British government decided not to reduce 
science spending and agreed to protect the £4.6 billion (US$7.3 billion) 
core science budget over the next four years. SEE P.1010 


Arsenic-based life was discovered. Or not. 


A cryptic announcement from NASA in November said that the agency 
had important astrobiology news, leading many to speculate that it was 
set to unveil extraterrestrial life. Instead, during a media conference on 
2 December, researchers announced the discovery of ordinary earthly 
bacteria from Mono Lake in California that seemed to do something 
extraordinary — use arsenic as a building block for DNA and proteins, 
in place of the phosphorus relied on by other organisms. But as soon 
as the unprecedented finding was made public, it drew sharp criticism 
from the scientific community. Biochemists took to the blogosphere, 
attacking the methodology and assumptions of the original research and 
provoking a flurry of articles in the media. Further work will be needed 
to settle whether the bacteria actually do use arsenic in their biochem- 
istry as opposed to just cleverly thwarting its toxic effects. 


A morality expert was accused of mischief


In August, Harvard University found that Marc Hauser, a leader in the 
field of animal and human cognition, had committed eight counts of 
scientific misconduct. Hauser studies the evolutionary origin of char- 
acteristics such as morality, language and mathematical ability, and his 
work has been profiled in many news outlets, such as The New York Times 
and the Wall Street Journal. Many scientists in the field called on Harvard 
to release the details of its investigation, saying that they could affect any 
research that uses Hauser’s as a basis. As yet, Harvard has not done this. 
Hauser has retracted or amended at least three papers, which appeared 
in Cognition, Proceedings of the Royal Society and Science, respectively. 
She set out to revolutionize US ocean management


— but first she faced the oil spill. Jane Lubchenco 


is Nature’s Newsmaker of the Year.


BY RICHARD MONASTERSKY 


Jane Lubchenco smiles as a dolphin leaps out of the water, arcs in the
air and splashes back down just a few metres away. The 63-year-old
marine ecologist is out on a boat near Pascagoula, Mississippi, with
a team of researchers studying how the recent oil spill in the Gulf of
Mexico has affected dolphin communities there.

On this October day, Lubchenco wears starfish-shaped earrings and
a cap emblazoned with the letters ‘NOAA’, for National Oceanic and
Atmospheric Administration. Her shirt sports a NOAA logo, as does
her life vest. Rarely does she venture out in public without some symbol
of the US government agency she has proudly run since March 2009. A
sprawling department of 12,800 people with a budget of US$4.7 billion,
NOAA has responsibilities stretching from the bottom of the sea to the
top of the atmosphere and even to the Sun, which it monitors for signs
of solar storms. That mandate put Lubchenco at the centre of the government’s
response to the BP Deepwater Horizon oil-spill disaster — a
brutal test for a scientist with little previous management experience.

On board the boat, she relishes the chance to talk about dolphin
behaviour with the NOAA researchers, but seems to get the biggest
kick when the pilot gives her a turn at the wheel. Gripping the
throttle, Lubchenco has to be reminded to stay below the speed limit
as she motors through the narrow waterway.

Going slow does not come easily to the NOAA leader. As a celebrated
scientist and vocal conservationist, she made her name urging other
researchers to speak out on issues of public importance, a stance that
not all of her academic colleagues were comfortable with. Now, at an
age when many of her cohort are easing back, she is taking on the most
ambitious challenge of her career: reorienting how the nation responds
to pressing environmental problems such as dwindling fish stocks, rising
seas and a changing climate. She has bold plans to strengthen scientific
research at NOAA, make it more relevant to society and improve the
health of ecosystems and coastal communities.

Lubchenco testifies at a Senate hearing on the Deepwater Horizon oil spill.

But the path has not been smooth for Lubchenco, who took over the
agency in troubled times. With the economy
in a nose dive and many coastal communities 
struggling, NOAA’s policies to limit fishing
have proved so contentious that members of 
US President Barack Obama’s own party called 
for Lubchenco to resign. And the oil-spill dis- 
aster has severely tested her political skills. 
Some of her natural constituency — scientists 
and environmentalists — have accused her of 
quashing independent researchers, suppress- 
ing information and misleading the public. 
Although she admits to some communica- 
tions problems during the crisis, Lubchenco 
shakes off the broader criticisms. “I’m very
proud of what we did during the heat of the
moment,” she says. NOAA closed down fisheries,
forecast where currents would sweep the
oil, monitored storms during one of the most 
active hurricane seasons on record, protected 
endangered marine species and is leading the 
effort to assess damage done by the oil. “I give 
her very high marks as a leader in what has 
been a difficult time for NOAA,” says Michael 
Jackson, who was deputy director of the US 
Department of Homeland Security in 2005,
during Hurricane Katrina. 

Throughout this day on the Gulf of Mexico, 
Lubchenco keeps up a hectic pace, visiting 
multiple sites in the Alabama and Mississippi 
area. This is her eleventh trip to the Gulf of 
Mexico region since the Deepwater Horizon 
oil rig exploded on 20 April, unleashing the 
largest single marine spill in US history. 

In person, Lubchenco makes an easy con- 
nection with strangers. She looks them in the 
eye and asks about their jobs and how the 
spill affected them. Before lunch, she meets 
more than two dozen teachers from across the 
Gulf and starts by telling them how much she 
appreciates their work. “My sister is a middle- 
school science teacher. My daughter-in-law 
is a high-school science teacher, and I was 
strongly affected by teachers,” she says. 

The teachers introduce themselves and talk 
about how the spill touched their students, 
many of whose parents were put out of work 
when the spreading oil closed fishing grounds 
and drove away tourists. The teachers thank 


Lubchenco for all the information that NOAA 
posted on its website, which their classes used 
to find out which fishing areas were closed, 
where the winds were going and whether cur- 
rents would carry the oil out of the Gulf. “We 
would check your site every day,” said one 
teacher. “We used so much of that data.” 


CRISIS MANAGEMENT 
With the well capped and the oil dispersing, 
Lubchenco has entered calmer waters after 
the tumultuous spring and summer of the 
crisis. She was one of the ‘principals’ — the 
top administration officials working on the 
spill, who regularly briefed President Obama 
and rarely rested. Two weeks after the rig 
exploded, she ran into an old friend at a party 
in Washington. 

“Jane, you look really tired,” he told her.

“Yeah, I'm sleeping three or four hours a 
night,” she confided to him.

Such was the toll of running the lead ocean 
agency during one of the biggest environ- 
mental disasters in US history. The task was 


complicated by a series of communications 
missteps, her own and those of other officials, 
which drew accusations that she had withheld 
information about the environmental toll of 
the spill. 

The first flashpoint was the question of how 
much oil was leaking from the wellhead and 
where it was going. Days after the spill, when 
BP was estimating that 1,000 barrels of oil were 
pouring out each day, a NOAA researcher 
arrived at a far higher figure of 5,000-10,000 
barrels — a “very rough estimate”, his e-mail
warned. But that was not released to the pub- 
lic. Instead, a Coast Guard admiral in charge of 
responding to the spill said in a press confer- 
ence on 28 April that “NOAA experts believe 
the output could be as much as 5,000 barrels”. 

That figure stood as the sole government 
estimate for a month. At the same time, inde- 
pendent researchers came up with estimates 
in the range of 25,000-100,000 barrels a day. 
Months later, the government concluded that 
the well had gushed 62,000 barrels a day ini- 
tially and then declined to 53,000 (a figure 
that BP contends is too high).

Other issues also suggested to some that 
NOAA and the rest of the government were 
downplaying the magnitude of the problem. 
In mid-May, academic scientists working in 
the Gulf started finding evidence that untold 
amounts of oil were spreading away from 
the wellhead and forming vast plumes some 
1,200 metres below the surface¹. NOAA initially questioned
the evidence and dismissed
media reports as “misleading”, even as more 
evidence emerged. Donald Boesch, president 
of the University of Maryland Center for 
Environmental Science in Cambridge and a 
member of a commission that subsequently 
reviewed the government’s response, says that 
was a mistake. “Jane was too dismissive about 
the fact that there could be a significant deepwater
plume there,” he says. On 8 June, after
analysis of more data collected by academic 
scientists, NOAA acknowledged the presence 
of diffuse plumes of oil beneath the surface. 


THE FATE OF THE OIL 

On 15 July, BP finally succeeded in capping 
the well, but there were still major questions 
about what had happened to all the oil that had 
escaped over the past three months. In early 
August, NOAA and other agencies released 
an ‘oil budget’ which tallied the fate of all the 
released oil. Carol Browner, director of the 
White House Office of Energy and Climate 
Change Policy, announced on television that 
three-quarters of the oil was “gone”. But that did 
not match the government's own numbers. 

Later that day, Lubchenco appeared with 
Browner at a White House press conference 
and corrected the record. “It’s important to 
point out that at least 50% of the oil that was 
released is now completely gone from the system,” said
Lubchenco. Illustrating her statistics with a pie chart produced by NOAA and
other agencies, Lubchenco said that contain- 
ment efforts had removed roughly a quarter 
of the oil and another quarter had either evap- 
orated or dissolved. The rest had dispersed as 
tiny subsurface droplets or as visible oil, and 
some of that had been collected from beaches 
or naturally degraded. 

But in making that correction, Lubchenco 
made a different mistake by saying that the oil 
budget had been “peer reviewed”, a statement
at odds with the reports of scientists who sup- 
posedly reviewed it. Academics and members 
of Congress also criticized NOAA’s decision
to release the four-page oil budget without 
uncertainty ranges or the background data 
that justified the conclusions. 

Reacting to the series of gaffes, the national 
commission investigating the oil spill declared 
in October that “the federal government cre- 
ated the impression that it was either not fully 
competent to handle the spill or not fully 
candid with the American people about the 
scope of the problem”. At the very least, those 
issues undermined the public’s trust in the
government, said the commission.

For Lubchenco, the judgement was both 
troubling and ironic. Given her record of urg- 
ing scientists to speak out, she says, “I would be 
the last person in the world to be not valuing 
or promoting communication.” She says that
she initially baulked at the 5,000-barrel-a-day 
flow-rate statement. “My inclination was to 
correct the record, but in the grand scheme of 
things, since we didn't have the accurate num- 
bers and we were working on getting them, 
it didn’t seem to be that important relative to 
all the other stuff that was going on.” Know- 
ing how much oil was flowing would not have 
helped the effort to contain it, she argues — an 
assertion challenged by the oil-spill commis- 
sion, which says that knowledge of the true 
flow rate might have helped BP to avoid some 
problems in its attempts to cap the well. “In 
hindsight,” says Lubchenco, “it took far too 
long to come up with the eventual answer.” 

During a press conference in November, 
she also acknowledged that she had erred in 
declaring that the oil budget had been peer 
reviewed. In a subsequent interview, she took 
personal responsibility for the miscommunica- 
tion. “I misunderstood what kind of review it 
had had, so that was my mistake,” she said. 

But Lubchenco defends her agency’s state- 
ments about the subsurface plumes, saying 
that NOAA was just insisting on careful sci- 
ence. “It’s frustrating to get crosswise with my 
academic colleagues when we thought all we 
were asking them to do was to be good scien- 
tists and to double check and make sure that 
what they were finding was in fact what they 
thought it was.” 

Some scientists are still bothered by NOAA's 
slow acknowledgement of the deep oil, but others
agree with her approach. “There was a lot of
speculation early on,” says Richard Camilli of
the Woods Hole Oceanographic Institution in 
Massachusetts, who led a cruise that uncovered
signs of a deep plume of oil in June. “Good
science requires peer review. If you're going to 
say something public it should go through peer 
review first,” says Camilli, who published his
findings in Science in August².

Many scientists laud NOAA’s overall performance
during the spill. Boesch, although
critical of Lubchenco’s initial response to 
reports of deep plumes, says that she and 
NOAA provided “very critical science sup- 
port to help direct the spill response where 
it was needed”. And he praises the agency for 
doing something that gets little mention — 
successfully keeping the nation’s seafood safe 
by closing fishing areas and reopening them 
only after rigorous testing. “That protected the 
public,” he says, “and in the long run protected
the industry.” 


DEFYING EXPECTATIONS 

By late October, the sheen of oil had disap- 
peared from the surface of the Gulf and 
NOAA had shifted towards assessing the 
damage. “It’s far from over,” says Lubchenco.
“It’s going to be years, if not decades, before 
we really understand the impact this massive 
infusion of hydrocarbons has had on this 
system.” 

In Mississippi Sound earlier that day, 
Lubchenco relished the chance to spend part 
of her weekend on the water. As a scientist, she 
has studied ocean ecosystems for 40 years — an 
unlikely focus for a girl growing up in the 1950s 
in Denver, Colorado, in the middle of the continent.
But the women in the Lubchenco family
have long challenged expectations.

In the early 1900s, her paternal grandmother 
left her parents’ cotton farm in South Carolina 
to train in medicine, only to find that the dean 
of one of the nearest medical schools, in North 
Carolina, would not accept a woman. She 
finally wore him down, became the first female 
graduate in 1912 and then married a Ukrain- 
ian agricultural researcher who had visited her 
family’s farm years earlier. (He narrowly made 
it to her graduation ceremony, after having 
missed the steamer he had originally booked 
to America — the Titanic.) 

Lubchenco’s parents were also doctors,
and her mother worked part-time so that she 
could have a career and raise her six girls. In 
that household, everybody was expected to fol- 
low their interests. “Mom and Dad were always 
great about encouraging us to explore. Of the 
six of us, we all do completely different things,” 
says Lubchenco. 

In secondary school, young Jane was a clas- 
sic overachiever: an athlete, scholar and leader, 
she won the school’s highest award. But rather 
than go to a powerhouse university, she chose 
tiny Colorado College in Colorado Springs 
and enrolled in an unusual programme with 
no classes, no grades and no tests. She discov- 
ered that she liked biology and took a summer 
class at the Marine Biological Laboratory in 
Woods Hole, Massachusetts, where she fell in 
love — with invertebrates and research. “That 
whole summer was magical for me,” she recalls.
“It made me decide I was going to go to grad
school and it was going to be marine science.”


After getting her PhD at Harvard Univer- 
sity in Cambridge, Massachusetts, and teach- 
ing there for two years, Lubchenco took what 
some considered a step down by moving to 
Oregon State University in Corvallis, where 
she and her husband, ecologist Bruce Menge, 
bargained to split an academic position. It 
was perhaps a first in the United States, and 
it gave them both a chance to teach, conduct 
research and raise their children. The two also 
split their research on tidal communities, with 
Lubchenco studying the herbivores and sea- 
weeds and Menge the predators and prey. 

At the time, ecology was largely a descrip- 
tive science, but Lubchenco was part of a 
group pushing to introduce experimental 
approaches. In graduate school, she started 
moving herbivorous snails around tide pools 
to tease apart the factors controlling the dis- 
tribution of seaweeds. 

Most researchers had assumed the answer 
had to do with physical limitations, such 
as how much a tide pool dries out. But 
Lubchenco demonstrated that the herbivores 
had an important role in controlling the plant 
populations’ — a finding that also turned out 
to be true in some terrestrial ecosystems. Her 
simple, elegant experiments became a staple 
in ecology courses, and her papers garnered 
hundreds of citations. 

Lubchenco also made a name for herself 
by urging fellow ecologists to speak out on 
environmental issues. As vice-president of 
the Ecological Society of America in 1988-89, 
she chaired a panel that called for ecologists to
communicate to the public and policy-makers.
“It was a coming of age for our society, to admit
that relevance was not a four-letter word,”
recalls Lubchenco (see page 1032). Later, while
serving as president of the American Associa- 
tion for the Advancement of Science — the 
premier scientific organization in the United 
States — in 1996-97, she continued to push 
scientists to become more socially relevant. 

Now she has a chance to bolster science and 
its connection to policy-making at the highest 
level. NOAA has a long history of conducting 
some top-notch science and has nurtured pio- 
neering researchers such as ozone specialist 
Susan Solomon and climate modeller Syukuro 
Manabe. But it has been perpetually strapped 
for cash, and previous administrations have at 
times focused less on the science than on the 
divisions that provide services, such as fore- 
casting weather and managing fisheries. 

When Lubchenco discussed the NOAA 
post with Obama soon after he was elected 
in 2008, she told him that one of her goals 
would be to renew that commitment to 
science. Obama's response to this proposal 
and others that she made, she says, was “let's 
do it”. 

Once she took office, Lubchenco set out 
to resurrect the chief-scientist position 
at NOAA, which has been vacant for 14 
years. But she got a lesson in the slow ways 
of Washington. Much to her frustration, it 
took months for the Obama administration 
to approve her choice, Scott Doney of the 
Woods Hole Oceanographic Institution, and
a senator this month put a block on Doney’s
nomination to protest against the adminis- 
tration’s moratorium on offshore drilling. 
In the meantime, Lubchenco has increased 
the number of senior scientific positions at 
NOAA from 10 to 25, and altered the career 
structure within the agency so that scientists 
can advance in seniority and salary without 
having to leave research for a purely manage- 
ment position. 

Lubchenco has made significant progress 
on her other priorities, say many who have 
watched NOAA under her leadership. “She's 
done the job certainly as well, and I would
argue better, than anyone else,” says Andrew
Rosenberg, a senior vice-president at Conser- 
vation International and deputy director of 
NOAA’s fisheries service from 1998 to 2000.

When Lubchenco arrived in Washington, 
one of the first problems she had to tackle 
was the National Polar-orbiting Operational 
Environmental Satellite System 
(NPOESS). Designed to collect 
weather and climate data, it was 
running years late and more than 
$5 billion over budget. Lubchenco 
and her colleagues in the admin- 
istration developed a plan to split 
the unwieldy system into a military 
part and a civilian part to be jointly 
managed by NOAA and NASA — 
a step that could finally get the 
NPOESS back on track. 

Lubchenco has also pushed 
forward an initiative to create a 
NOAA division called the Climate 
Service, which the agency had 
been discussing since just after it 
was founded in 1970. The goal is to 
gather NOAA’s decentralized climate
expertise into a single office to
enhance the science and provide an authorita- 
tive voice on climate information. The biggest 
reorganization in NOAA’s history, this office
— which awaits congressional approval — will 
give the public and businesses forecasts such as 
long-term temperature projections and flood- 
ing maps that take into account sea-level rise. 


FISHING WOES 
For environmentalists, one of the biggest suc- 
cesses of Lubchenco’s tenure so far has been 
the administration's new ocean policy, which 
Obama signed on 19 July. A centrepiece of 
the policy is a strategy, long championed by
Lubchenco, called coastal and marine spatial
planning, which seeks to assess and balance 
human activities in particular ocean regions 
so that they do not conflict with each other or 
harm ecosystems. In the past, the government 
has tended to manage activities such as fishing 
individually, without considering how other 
factors, such as oil drilling and coastal devel- 
opment, might interact with them. 

“What Jane has done is catalysed the
most important transformation in ocean
management in our history,” says Elliott Norse,
president of the Marine Conservation Biology
Institute in Bellevue, Washington.

All that change has brought some strong 
criticism, especially from the fishing indus- 
try. Under her leadership, NOAA has moved 
to implement the 2007 Magnuson-Stevens 
Reauthorization Act, which requires the 
agency to end overfishing. NOAA’s actions so
upset some fishermen in Gloucester, Massa- 
chusetts, that they built a life-sized model of 
Lubchenco hanging fishermen. The rhetoric 
in Congress, with the calls for her resignation, 
was only slightly less inflamed. 

The source of the strife in New England 
goes back long before Lubchenco took office. 
Oversight of fishing in US federal waters is 
complicated; NOAA shares management 
duties with eight regional councils made up 
of federal and state government officials and 
members of the public, including the fishing
industry. The councils choose how they want
to control fishing and propose annual limits
on each type of seafood. NOAA assesses the
plans and then approves or rejects them.

Jane Lubchenco and her husband, Bruce Menge, with students in 1997.

In the past, NOAA had given management 
councils more latitude, but when Lubchenco 
took office, she made it clear that she expected 
them to meet the congressional deadline to end 
overfishing by this year. As part of that, NOAA 
last year encouraged the councils to consider 
a strategy called catch shares. In this scheme, 
councils allocate fishing ‘shares’ to individuals 
or groups, usually on the basis of how much 
they have previously caught. The recipients of 
shares can use or sell them. Proponents say that 
catch shares give fishing communities a long- 
term economic incentive to rebuild stocks. 
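
As a rough illustration of the allocation step described above, the sketch below splits an annual catch limit among share holders in proportion to their past catch. The function name, the boat labels and all tonnages are hypothetical and chosen only for illustration; actual programmes follow more elaborate rules set by each council.

```python
from fractions import Fraction


def allocate_catch_shares(historical_catch, annual_limit):
    """Split an annual catch limit in proportion to each holder's past catch.

    historical_catch maps a holder's name to an integer amount previously caught.
    Returns a dict mapping each holder to an exact (Fraction) quota.
    """
    total_history = sum(historical_catch.values())
    if total_history <= 0:
        raise ValueError("historical catch must sum to a positive amount")
    return {
        holder: Fraction(caught, total_history) * annual_limit
        for holder, caught in historical_catch.items()
    }


# Hypothetical past catches (tonnes) for three illustrative holders.
history = {"boat_a": 50, "boat_b": 30, "boat_c": 20}
quotas = allocate_catch_shares(history, annual_limit=80)
```

Because the quotas are fractions of a fixed limit, they always sum to that limit, so the total catch stays capped however the shares are later used or sold.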

Although the strategy has been used around 
the world and in parts of the United States for 
decades, the transition to a catch-shares system 
can be difficult. “It has to be done very care- 
fully. It has to involve the community, from the 
bottom up,” says Brian Rothschild, a professor
of marine science at the University of Massachusetts
at Dartmouth who has close ties to
the New England fishing community. He contends
that NOAA and the New England Fishery
Management Council moved too quickly
in May to implement a programme based on 
catch shares, without properly involving the 
local fishing community or explaining the sys- 
tem. Some fishing communities say that the 
policy has caused major job losses. 

Lubchenco and others argue that New Eng- 
land’s policy was five years in the making and 
the community had ample time to get involved. 
They also contend that fishermen in the area 
have been struggling economically for years — 
long before the management council adopted 
the new programme. “The reality is that this 
isn’t about catch shares,” says Lubchenco. “It
really is about the economy.”

Peter Baker, manager of the Pew Envi- 
ronment Group’s New England overfishing 
campaign, agrees. He says that Lubchenco 
“has taken a stand to fix things for the future”. 

Those who have criticized her 
policy have not offered a viable 
alternative, he says. “I’m not sure 
that anything would be enough to 
appease her detractors.” 

As difficult as this year has been 
for Lubchenco, the next few will 
offer further challenges. NOAA’s
budget increased by 21% during 
the past two years, but Obama and 
Congress are now committed to 
cutting spending and the outlook 
for NOAA is bleak. The agency has 
never enjoyed the same support in 
Congress as some other science 
agencies, such as the National 
Institutes of Health. But Lubchenco 
thinks that the recent crises deliver 
a message on the value of NOAA’s
research and science-based management.
“It seems NOAA’s relevancy has
been more obvious in the last couple of years,” 
she says. 

Nowhere is that clearer than out on the 
Gulf of Mexico, where signs of dead coral and 
other long-lasting effects of the oil spill are 
starting to appear. While travelling through 
the region, Lubchenco recalls that she turned 
down Obama’s transition team several times
when she was first offered the job. Leaving 
her husband and research behind in Oregon 
seemed too big a sacrifice. But in the end, she 
says, she believed in the new president and in 
the opportunity to achieve her lifelong goals. 
“I came to NOAA to lead and enable change
where it would make a difference,” she later
explained. The rough days so far have not
discouraged her. “Meaningful change is not
for the timid.”

SEE EDITORIAL P.1002


Richard Monastersky is a features editor 
with Nature in Washington DC.

A UK crop circle, created by activists to signify uncertainty over where genetic contamination can occur.


Keep it complex 


When knowledge is uncertain, experts should avoid 
pressures to simplify their advice. Render decision-
makers accountable for decisions, says Andy Stirling.

Worldwide and across many fields,
there lurks a hidden assumption
about how scientific expertise
can best serve society. Expert advice is often
thought most useful to policy when it is pre- 
sented as a single ‘definitive’ interpretation. 
Even when experts acknowledge uncer- 
tainty, they tend to do so in ways that reduce 
unknowns to measurable ‘risk’. In this way,
policy-makers are encouraged to pursue (and 
claim) ‘science-based’ decisions. It is also not 
uncommon for senior scientists to assert that 
there is no alternative to some scientifically 
contestable policy. After years researching 
— and participating in — science advisory 
processes, I have come to the conclusion that 
this practice is misguided. 

An overly narrow focus on risk is an inad- 
equate response to incomplete knowledge. It 
leaves science advice vulnerable to the social 
dynamics of groups — and to manipulation 
by political pressures seeking legitimacy, 
justification and blame management. When 
the intrinsically plural, conditional nature 
of knowledge is recognized, I believe that 
science advice can become more rigorous, 
robust and democratically accountable. 

A rigorous definition of uncertainty can be 
traced back to the twentieth-century economist
Frank Knight’. For Knight, “a measurable
uncertainty, or ‘risk’ proper ... is so far
different from an unmeasurable one that it 
is not in effect an uncertainty at all”. This is 
not just a matter of words, or even methods. 
The stakes are potentially much higher. A 
preoccupation with assessing risk means 
that policy-makers are denied exposure to 
dissenting interpretations and the possibility 
of downright surprise. 

Of course, no-one can reliably foresee 
the unpredictable, but there are lessons to 
be learned from past mistakes. For example, 
the belated recognition that seemingly inert 
and benign halogenated hydrocarbons were 
interfering with the ozone layer. Or the slow- 
ness to acknowledge the possibility of novel 
transmission mechanisms for spongiform 
encephalopathies, in animal breeding and 
in the food chain. In the early stages, these 
sources of harm were not formally charac- 
terized as possible risks — they were ‘early 
warnings’ offered by dissenting voices. Policy
recommendations that miss such warnings 
court overconfidence and error. 

EMBER 2010 | VOL 468 | NATURE | 1029

G. GRAF/GREENPEACE

The question is how to move away
from this narrow focus on risk to broader
and deeper understandings of incomplete 
knowledge. Many practical quantitative 
and qualitative methods already exist (see 
‘Uncertainty matrix’), but political pres- 
sure and expert practice often prevent them 
being used to their full potential. Choosing 
between these methods requires a more 
rigorous approach to assessing incomplete 
knowledge, avoiding the temptation to treat 
every problem as a risk nail, to be reduced 
by a probabilistic hammer. Instead, experts 
should pay more attention to neglected areas 
of uncertainty (in Knight’s strict sense) as 
well as to deeper challenges of ambiguity and 
ignorance’. For policy-making purposes, the 
main difference between the ‘risk’ methods
shown in the matrix and the rest is that the 
others discourage single ‘definitive’ policy 
interpretations. 


ANY JUSTIFICATION 

There are still times when ‘risk-based’ 
techniques are appropriate and can yield 
important information for policy. This can 
be so for consumer products in normal use, 
general road or airline-safety statistics, or 
the epidemiology of familiar diseases. Yet 
even in these seemingly familiar and 
straightforward areas, unforeseen pos- 
sibilities, and over-reliance on aggre- 
gation, can undermine probabilistic 
assessments. There is a need for humil- 
ity about science-based decisions. 

For example, consider the risk 
assessment of energy technologies. 
The other graphic (see ‘The perils of
‘science-based’ advice’) summarizes 
63 studies on the economic costs aris- 
ing from health and environmental 
impacts of different sets of energy tech- 
nologies. The aim of the studies is to 
help policy-makers identify the options 
that are likely to have the lowest impact. 
This is one of the most sophisticated 
and mature fields for quantitative risk- 
based comparisons. Individual policy 
reports commonly express their find- 
ings as if there were little room for 
doubt. Many of the studies present no 
— or tiny — uncertainty ranges. But 
taken together, these 63 studies tell a 
very different story’ — one usually 
hidden from policy-makers. The dis- 
crepancies between equally authoritative, 
peer-reviewed studies span many orders of 
magnitude, and the overlapping uncertainty 
ranges can support almost any ranking order 
of technologies, justifying almost any policy 
decision as science based. 
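
To make the point about overlapping ranges concrete, here is a minimal sketch that counts how many rank orders of a set of technologies are compatible with their reported uncertainty ranges. The technology labels and all numbers are invented for illustration and are not taken from the 63 studies; the function names are likewise hypothetical.

```python
from itertools import permutations


def ranking_is_supported(order, ranges):
    """Return True if some choice of values, one inside each range, is
    non-decreasing along `order` (lowest impact first)."""
    floor = float("-inf")
    for name in order:
        low, high = ranges[name]
        # The value picked for this item must be at least the previous one.
        floor = max(floor, low)
        if floor > high:
            return False
    return True


def supported_rankings(ranges):
    """List every rank order that the uncertainty ranges cannot rule out."""
    return [
        order for order in permutations(ranges)
        if ranking_is_supported(order, ranges)
    ]


# Illustrative (low, high) impact estimates in arbitrary cost units.
wide = {"A": (0.1, 100.0), "B": (0.5, 50.0), "C": (1.0, 200.0)}
narrow = {"A": (1.0, 2.0), "B": (3.0, 4.0), "C": (5.0, 6.0)}

wide_orders = supported_rankings(wide)
narrow_orders = supported_rankings(narrow)
```

With the wide, overlapping ranges every one of the six possible orderings survives, whereas the narrow, separated ranges leave only one. That is the sense in which pooled results spanning orders of magnitude can be read as backing almost any ranking.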

This is not just a problem with quantita- 
tive analysis. Qualitative science advice is 
also usually presented in aggregated and 
consensual form: there is always pressure 
on expert committees to reach a ‘consensus’ 
opinion. This raises profound questions over
what is most accurate and useful for policy. Is
it a picture asserting an apparent consensus,
even where one does not exist? Or would it
be more helpful to set out a measured array
of contrasting specialist views, explaining
underlying reasons for different interpreta-
tions of the evidence? Whatever the political
pressures for the former, surely the latter is
more consistent both with scientific rigour
and with democratic accountability?

[Figure: ‘Uncertainty matrix’. Axis label: knowledge about probabilities, from unproblematic to problematic.]

I believe that the answer lies in supporting 
more plural and conditional methods for sci- 
ence advice (the non-risk quadrants shown 
in ‘Uncertainty matrix’). These are plural 
because they even-handedly illuminate a 
variety of alternative reasonable interpreta- 
tions. And conditional because they explore 
explicitly, for each alternative, the associated
questions, assumptions, values or inten- 
tions*. Under Knightian uncertainty, for 
instance, pessimistic and optimistic inter- 
pretations can be treated separately, each 
explicitly associated with assumptions, dis- 
ciplines, values or interests so that these can 
be clearly appraised. It reminds experts that 
absence of evidence of harm is not the same 
as evidence of absence of harm. It also allows 
scenario analysis and the consideration of
sensitivity, enabling more accountable evaluation.
For example, it could allow experts to
highlight conditional decision rules aimed at 
maximizing best or worst possible outcomes, 
or ‘minimizing regrets”. 
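
The conditional decision rules mentioned above can be sketched in a few lines. The example below assumes a small payoff table (higher is better) in which each option is scored separately under an optimistic and a pessimistic interpretation; the option names and numbers are hypothetical, and the three rules shown are the standard textbook forms (maximin, maximax and minimax regret), not a procedure taken from the article.

```python
def maximin(payoffs):
    """Pick the option whose worst-case payoff is highest."""
    return max(payoffs, key=lambda option: min(payoffs[option].values()))


def maximax(payoffs):
    """Pick the option whose best-case payoff is highest."""
    return max(payoffs, key=lambda option: max(payoffs[option].values()))


def minimax_regret(payoffs):
    """Pick the option whose largest regret across scenarios is smallest.

    Regret in a scenario is the gap between an option's payoff and the
    best payoff any option achieves in that scenario.
    """
    scenarios = list(next(iter(payoffs.values())))
    best = {s: max(p[s] for p in payoffs.values()) for s in scenarios}
    worst_regret = {
        option: max(best[s] - p[s] for s in scenarios)
        for option, p in payoffs.items()
    }
    return min(worst_regret, key=worst_regret.get)


# Hypothetical payoffs under two contrasting interpretations of the evidence.
payoffs = {
    "cautious": {"optimistic": 4, "pessimistic": 3},
    "bold": {"optimistic": 10, "pessimistic": -2},
    "mixed": {"optimistic": 7, "pessimistic": 1},
}

choices = {
    "maximin": maximin(payoffs),
    "maximax": maximax(payoffs),
    "minimax_regret": minimax_regret(payoffs),
}
```

In this toy table the three rules select three different options, which illustrates why the choice of rule is itself a judgement that should be made explicit.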

The few sporadic examples of the appli- 
cation of this approach show that it can be 
practical. One particularly politicized and 
high-stakes context for expert policy advice 
is the setting of financial interest rates. The 
Bank of England’s Monetary Policy Committee,
for example, describes its expert advisory
process as a “two-way dialogue”, with a
priority placed on public accountability. 
Great care is taken to inform the commit- 
tee, not just of the results of formal analysis 
by the sponsoring bodies, but also of com- 
plex real-world conditions and perspectives. 
Reports detail contrasting recommendations 
by individual members and explain reasons 
for differences®. Why is this kind of thing not 
normal in science advice? 

When scientists are faced with unmeas- 
urable uncertainties, it is much more usual 
for a committee to spend hours negotiating 
a single interpretation across a spread of con- 
tending contexts, analyses and judgements. 
From my own experiences of standard- 
setting for toxic substances, it would often 
be more accurate and useful to accept these 
divergent expert interpretations and focus 
instead on documenting the reasons. In my 
view, concrete policy decisions could still 
be made — and possibly more efficiently. 
Moreover, the relationship between the 
decision and the available science would be 
clearer and the inherently political dimen- 
sions more honest and accountable. 

Problems of ambiguity arise when experts 
disagree over the framing of possible options,
contexts, outcomes, benefits or harms.
Like uncertainty, these cannot be 
reduced to risk analysis, and demand 
plural and conditional treatment. Such 
methods can highlight — rather than 
conceal — different regulatory ques- 
tions, such as: “what is best?”, “what 
is safest?”, “is this safe?”, “is this toler- 
able?” or (as is often routine) “is this 
worse than what we have now?” Nobel- 
winning work in rational choice shows 
that when ambiguity rules there is no 
guarantee, as a matter of logic, that 
scientific analysis will lead to a unique 
policy answer’. Consequently, defini- 
tive science-based decisions are not 
just potentially misleading; they are a
fundamental contradiction in terms. 


METHODS THAT WORK 

One practical example of ways to be 
plural and conditional when consid- 
ering questions and options, as well 
as in deriving answers, is multicri- 
teria mapping. Other participatory 
and deliberative procedures include 
interactive modelling and scenario work- 
shops, as well as Q-method and dissensus 
methods. Multicriteria mapping makes use 
of simple but rigorous scoring and weight- 
ing procedures to reveal the ways in which 
overall rankings depend on divergent ways 
of framing the possible options. In 1999, 
Unilever funded me and colleagues to use 
multicriteria mapping to study the perspec- 
tives of different leading science advisers on 
genetically modified (GM) crops*. The backing
of this transnational company helped
draw high-level UK government attention.
A series of civil servants told me, in quite
colourful terms, that results mapped out in
plural, conditional fashion would be “abso-
lutely no use” in practical policy-making. Yet
when a chance finally emerged to present
results to Mo Mowlam, the relevant cabinet
minister, the reception was very positive.
She immediately appreciated the value of
having alternative perspectives laid out for a
range of policy options. It turned out in this
case, that the real block to a plural, condi-
tional approach was not the preferences of
the decision-maker herself, but of some of
those around her.

A survey of 63 peer-reviewed studies of health and environmental risks associated with energy technologies.
Individual studies offer conclusions with surprisingly narrow uncertainty ranges, yet together the literature
offers no clear consensus for policy makers.
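
The scoring and weighting idea behind multicriteria mapping, mentioned earlier in this section, can be illustrated with a deliberately simplified sketch. It assumes a plain weighted average of scores; the option names, criteria, scores and adviser weights are all hypothetical, and the actual method is richer than this (participants can, for example, define their own criteria and give ranges of scores).

```python
def rank_options(scores, weights):
    """Rank options by weighted average score, best first.

    scores maps each option to a dict of criterion -> score (higher is better).
    weights maps each criterion to the importance one perspective gives it.
    """
    total_weight = sum(weights.values())
    overall = {
        option: sum(weights[c] * s for c, s in by_criterion.items()) / total_weight
        for option, by_criterion in scores.items()
    }
    return sorted(overall, key=overall.get, reverse=True)


# Hypothetical performance scores, shared here by both advisers for simplicity.
scores = {
    "option_x": {"yield": 9, "biodiversity": 2},
    "option_y": {"yield": 5, "biodiversity": 6},
    "option_z": {"yield": 2, "biodiversity": 9},
}

# Two hypothetical perspectives that weight the same criteria differently.
perspectives = {
    "adviser_1": {"yield": 3, "biodiversity": 1},
    "adviser_2": {"yield": 1, "biodiversity": 3},
}

rankings = {who: rank_options(scores, w) for who, w in perspectives.items()}
```

The same scores produce opposite rankings under the two weightings. Laying the rankings out side by side, rather than averaging them into a single answer, is what makes the result plural and conditional.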

In my experience, it is the single defini- 
tive representations of science that are most 
vulnerable to political manipulation. Plural, 
conditional approaches are not immune, but 
they can help make political pressures more 
visible. Indeed, this is what happened dur- 
ing another GM policy process in which I 
was involved: the 2003 UK science review of 
GM crops. Reporting included explicit dis- 
cussion of uncertainties, gaps in knowledge 
and divergent views — and was described as 
“neither a red nor a green light” for GM tech- 
nology. A benefit of this more open approach 
is that it helped GM proponents and critics 
to work more effectively together during the 
committee deliberations, without a high- 
stakes, ‘winner takes all’ dynamic. There was 
more space to express alternate interpreta- 
tions, free from implications that one party 
or another was wrong. This is important in a 
highly politicized area such as GM science,
where there are entrenched interests on both 
sides. Yet this unusual attempt to acknowl- 
edge uncertainty was not universally popu- 
lar. Indeed, it was also the only occasion, to 
my knowledge, on which the minutes of a 
UK science advisory committee formally
documented covert attempts to damage
the career of one of its members (me, in 
this case)’. Perhaps for political — rather 
than scientific — reasons, this experiment 
towards plural and conditional advice has 
not been repeated. 

A further argument for using more plural
approaches arises from the state of ignorance,
in which ‘we don’t know what we
don’t know’. Ignorance typically looms in
the choice of which of a range of feasible,
economically viable future paths to support,
either through funding or regulation,
for emerging technologies. In a finite and
globalizing world, no single path can be fully
realized without detracting from the potential
for others. Even in the most competitive
consumer markets, for [...] AC electricity or
light-water reactors. This is not evidence of
inevitability, but of the ‘crowding out’ of potential
alternatives. Likewise, locking-in occurs in
the prioritizing of certain areas of scientific
enquiry over others. The paths taken by
scientific and technological progress are far
from inevitable. Deliberately or blindly, the
direction of progress is inherently a matter
of social choice’®.

A move towards plural, conditional advice
would help avoid erroneous ‘one-track’,
‘race to the future’ visions of progress. Such
advice corrects the fallacy that scepticism
over a specific technology implies a general
‘anti-science’ sentiment. It defends against
simplistic or cynical support for some particular
favoured direction of change that is
backed on the spurious grounds that it is
somehow synonymous with ‘sound science’,
or uniquely ‘pro innovation’.

Instead, plural, conditional advice helps
enable mature and sophisticated policy
debate on broader questions. How reversible
are the effects of a particular path, if we learn
later that it was ill-advised? How flexible are
the associated industrial and institutional
commitments, allowing us later to shift
direction? How adaptable are the innovation
systems? What part might be played by the
deliberate pursuit of diverse approaches (to
hedge ignorance, defend against lock-in
or foster innovation) in any given area?

Thus, such advice provides the basis for
a more-equal partnership between social
and natural science in policy advice. Plural
and conditional advice may also help resolve
some polarized fault-lines in current debates
about science in policy. It shows how we
might better: integrate quantitative and
qualitative methods; articulate ‘risk assessment’
and ‘risk management’; and reconcile
‘science-based’ and ‘precautionary appraisal’
methods.

A move towards plural and conditional
expert advice is not a panacea. It cannot
promise escape from the deep intractabilities
of uncertainty, the perils of group dynamics
or the perturbing effects of power. It differs
from prevailing approaches in that it makes
these influences more rigorously explicit and
democratically accountable.


COMMENT


Stand up for 
science 


This year showed that good communication can make 
you a leader, and a better scientist, says Nancy Baron. 


Back in 2001, I sat at the rear of a
classroom with Jane Lubchenco,
co-founder of the Aldo Leopold
Leadership Program, while scientists stepped 
forward to share their fears and failures con- 
cerning communicating with the media 
and policy-makers. “I get a lot of calls from 
the press, and I don’t return most of those 
calls,” confessed Margaret Palmer, a restora- 
tion ecologist at the University of Maryland 
in College Park. A wave of sympathetic
laughter rippled through the audience.

After that two-week communications
training workshop, Palmer decided to change 
her ways. Earlier this year, she co-authored 
a paper challenging US government poli- 
cies that allow irreversible ecological dam- 
age through mountain-top mining in the 
pursuit of cheap coal’. An avalanche of 
attention included an invitation to appear 
on the satirical television show, The Colbert 
Report. This time Palmer returned the call.
Despite Stephen Colbert’s bombastic efforts
to disarm her, Palmer laughed, leaned in and 
scored a series of carefully prepared points 
while 1.2 million viewers watched. 

Palmer has become well known not just as a
scientist, but as a leader. Her prominence has 
helped the University of Maryland become 
the finalist, pending formal approval by the 
review board, for a prestigious US National 
Science Foundation-funded (environmental 
synthesis research) centre to produce policy- 
relevant science with the active participation 
of decision-makers. In other words, science 
designed to make a difference. 

This year, more than ever before, a chorus 
of voices has been summoning scientists to 
emerge from their laboratories and become 
better communicators. Little has been said 
about one important reason for doing so: 
the intrinsic link between communication 
and leadership. It’s no coincidence that envi- 
ronmental scientists who lead the pack, both 
within academia and beyond, are good com- 
municators. These scientists know how to 
articulate a vision, focus a debate and cut to 
the essence of an argument. They can make a 
point compelling, even to those who disagree. 
They talk about their science in ways that 
make people sit up, take notice and care. After 
a decade of working with scientists as a communications
coach and trainer, I am encour-
aged by the increasing number of scientists 
who are now chiselling doors and windows 
in the ivory tower to reach out. A new breed 
of communication-savvy researchers is 
emerging — albeit perhaps not fast enough. 

For scientists who would be agents of 
change, communication is not an add-on. 
It is central to their enterprise. They begin 
with a goal in mind, frame their research 
questions to produce useful results and 
think about how they will disseminate the 
information. Yet learning to communicate is 
a critical life skill not typically taught as part
of scientific training. It should be. 


SPOTLIGHTS OR HEADLIGHTS? 

This year, during the ‘Climategate’ affair, 
climate scientists froze in the face of scandal,
only to become the piñatas of sceptics
and deniers. Bashing these scientists con- 
tinues to be a favourite pastime of the Tea 
Party politicians in the United States, despite 
those involved being cleared of wrongdoing 
by several independent review panels. 
Any vindication has been largely ignored 
because, as Mark Twain purportedly said: 
“A lie can make it half way around the world 
before the truth has time to put its boots 
on.” Now, after losing ground in the court of 
public opinion, climate scientists are finally 
rallying — stepping up to answer questions, 
address misconceptions and actively counter 
misinformation and deception. One group
of scientists has set up a rapid-response team 
promising quick turnaround to queries from
government officials or the media. The
American Geophysical Union relaunched 
a climate question-and-answer service for 
the United Nations climate talks in Cancún,
Mexico, earlier this month — to address 
questions of science, not policy.

These are valuable steps to try to ensure 
scientific accuracy in the face of heated polit- 
ical rhetoric and wild conspiracy theories. 
But alone, they aren't enough. It’s important 
to remember that not answering what pol- 
icy-makers want and need to know leaves 
a void — one that contrarians are only too 
happy to fill. I concur with the late Stanford
University climatologist Stephen Schneider's 
view: “Staying out of the fray is not taking the 
‘high ground’; it is just passing the buck.” He 
believed that it is both possible and impor- 
tant to comment on policy without compro- 
mising scientific integrity. He would often 
say: “If you are asking me as a scientist, I 
would answer it this way ... If you are asking
me as a citizen, I would say...” In this way he
made his point without overstating his sci- 
ence, and became extremely influential. 

The Deepwater Horizon oil spill in the Gulf 
of Mexico illustrates how other scientists who 
have devoted time to thinking about commu- 
nication have risen to positions to help lead 
policy. Lubchenco, now the administrator 
of the National Oceanic and Atmospheric 
Administration (NOAA), was an early 
advocate for scientists to communicate (see 
page 1024). In her call to arms, a 1998 paper
in Science, she entreated scientists to be
more forthcoming and share their research to 
benefit government, managers, policy-makers 
and society at large. Next she helped launch 
the Aldo Leopold Leadership Program and 
the Communication Partnership for Science 
and the Sea (COMPASS). Both of these initia- 
tives help scientists connect with the media 
and policy-makers and deliver a bottom line 
to those with little time or patience. 

As the first marine ecologist to lead 
NOAA, an agency of about 12,800 employ- 
ees, Lubchenco knew her task was daunt- 
ing. The oil spill provided a real test. Even 
this veteran communicator could not 
control how the media presented mixed 
messages and rapidly unfolding events. 
In August, Lubchenco was criticized for 
painting too rosy a picture of how fast the 
oil was being dispersed. Her message of “do 
not prematurely prejudge the impacts” was 
lost in the media clips. Lubchenco perse- 
vered, consistently reiterating what was 
and wasn't known about the oil, its effects 
and its final fate. By November the message 
was picked up. Her experience gave her the 
patience and persistence needed. 

Scientists with a his-
spill’s ongoing saga. Donald Boesch, presi-
dent of the University of Maryland Center
for Environmental Science, is one of only two 
scientists on US President Barack Obama's 
seven-member commission on the Gulf of 
Mexico oil spill and offshore drilling. He 
was probably chosen from many qualified 
scientists because of his communication 
skills. Boesch is known for his ability to talk 
to people — from all walks of life — in a way 
that compels them to act. He is sympathetic, 
analytical and adaptive rather than superior, 
doctrinaire and inflexible. And he readily 
admits that he learns from his failures as well 
as from his successes. 

Boesch has taken criticism from some 
peers for being too much in the public eye. 
He says the rewards of knowing that he is 
making a difference are worth it. On numer- 
ous occasions, a governor has told him about 
a recent piece of scientific work in the news, 
not realizing that Boesch had brought it to 
the media's attention in the first place. Boe- 
sch knows that the media helps to set the 
agenda of policy-makers and the public, and 
uses that system accordingly. Boesch hopes 
he can help guide the commission with a 
rigorously documented report that recom-
mends actions to improve human and envi-
ronmental safety. But
priority sometimes
brings turbulence to 
academic careers. When something gets 
widely reported, the subsequent discussion 
in talk radio, television and the blogosphere 
can distort the facts like a funhouse mirror. 
Defending oneself can eat up valuable hours. 
Attacks can come from industry, ideologues 
or even colleagues. 

Boris Worm, a marine ecologist from 
Dalhousie University in Halifax, Canada, for 
example, faced critiques that he had ‘over-
reached’ his results in two papers about
fish depletion that got a lot of public atten- 
tion. Instead of getting defensive, he engaged 
with his critics — and ultimately ended up 
collaborating with them.

Most scientists I know who have felt such 
backlashes have few regrets. They dust them- 
selves off and respond with more and better 
science. Their concern for the environment 
trumps their fear of criticism, and the progress 
they see in policies justifies their efforts. 

Not every scientist wants to step up to the 
microphone — nor do they all need to. But 
for those who aim to change the world (and
many graduate students and postdocs do),
some changes to the academic system
would help. If young scientists are going to 
hone communication skills, they need the 
support of senior scientists to protect their 
interests and reputations at crucial junctures 
in their careers. In choosing an adviser, they 
should align themselves with scientists who 
have solid credentials and who share their 
values about outreach. Increasingly, many 
senior scientists are developing communica- 
tion courses for their students that range from 
one-day workshops to accredited courses. 


TIME WELL SPENT 

In my work with scientists, I often hear that 
they cannot afford the time to work on their 
communication skills, with their hectic
research, publishing and teaching schedules. I
see it another way: they cannot afford not to. 

Many of the most prolific and accom- 
plished scientists have risen to the top of 
their field by conducting significant, relevant 
research and working out how to commu- 
nicate it within their discipline and beyond. 
They know the value of being quizzed by 
Congress or the media, even if at times it 
can be uncomfortable. Going public forces 
them to distil the essence of their work and 
to think harder about the questions — what 
is known and what is left to discover. Worm’s 
philosophy is that engaging with thought- 
ful criticism — even if it seems harsh in the 
media spotlight — “makes everyone think 
more deeply and makes us push harder 
against the limits of the unknown”. 

That’s why sharpening communication 
skills has value beyond increasing public 
understanding. It can breach interdiscipli- 
nary boundaries within science and help 
colleagues with different viewpoints catch 
a glimpse of a bigger picture. Articulating 
vision and common goals has long been a 
cornerstone of leadership on the battlefield. 
Scientists would be wise to adopt a similar 
strategy. Being a good communicator is not a
trade-off. It makes you a better scientist.

The future of the Adélie penguin hangs in the balance as sea-ice loss in the Antarctic threatens the supply of staple prey such as krill.


CONSERVATION 


After the ice 


Yvon Le Maho is moved by a powerful account of the demise of the Adélie penguin. 


The impact of human activities on
ecosystems is slow and insidious.
Documenting the dramatic drop in 
populations of the Adélie penguin that has 
accompanied sea-ice loss and glacier retreat 
over the past three decades in the Antarctic, 
Fraser's Penguins reveals the profound envi- 
ronmental changes that are afoot. 
Award-winning journalist Fen Montaigne 
spent five months in Antarctica tracking pen- 
guins with ecologist Bill Fraser and his team. 
Fraser, a regular visitor to the US scientific 
station of Palmer in the northwest Antarctic 
peninsula since 1974, has witnessed the site 
change from a polar ice habitat into a milder
sub-Antarctic environment. Resisting the 
temptations of a quick research payoff, he 
began some of the first long-term studies 
on Antarctic seabird species, including the 
Adélie penguin (Pygoscelis adeliae). Such 
extended monitoring is an essential tool for 
assessing the health of regional ecosystems. 
As a researcher who visits Antarctica 
regularly, I found Montaigne’s account excep- 
tionally poignant. He voices the emotions 
that inundate everyone who works in this
vast wilderness. And he captures details
such as the fantastic scenery as the boat
picks its way through broken sea ice dotted
with resting seals and groups of penguins
squint-eyed under a dazzling light. This
was especially touching because I read the
book while rolling at sea in the company of
wandering albatrosses, on the way to my own
penguin study site.

Fraser’s Penguins: A Journey to the Future in Antarctica
FEN MONTAIGNE
Henry Holt: 2010. 288 pp. $26


Montaigne reminds us why the Adélie 
penguins, those “smart and fussy little men 
in evening clothes”, fascinated the first
Antarctic explorers such as Fabian von 
Bellingshausen, Ernest Shackleton, Roald 
Amundsen, Robert Scott and Edward 
Wilson. Those men had to fight against 
the cold to conquer the unforgiving Terra 
Australis Incognita, or ‘unknown southern
land’. Yet the diminutive Adélie penguins
thrive in these harsh conditions thanks to 
a unique suite of adaptations. 

Penguins have evolved layers of overlap- 
ping scale-like feathers and large reserves of 
energy-giving body fat, which allow them 
to swim through the icy waters and stand 
through fierce storms. After heavy snow falls 
on their breeding colonies, their heads may be 
barely visible, sticking out of breathing holes 
as they continue incubating their eggs. 

Visitors a century ago would probably have 
seen huge colonies of Adélie penguins. At 
the start of Fraser’s study, in the 1970s, there 
were more than 30,000 breeding pairs on the 
seven monitored islands around Palmer sta- 
tion. Populations there today have dropped 
by 80%. Shifting weather and snow patterns, 
the contraction of sea ice and the retreat of 
glaciers have impacted ecosystems through 
a cascade of effects along the food web. The 
Adélie penguins’ existence is intertwined
with the presence of sea ice, as they forage
on ice-dependent prey such as silverfish and krill.

Sea ice serves as a grazing area for juve- 
nile krill, who rake free the life-sustaining 
diatoms and phytoplankton embedded in 
the frozen ocean. As sea ice has declined 
along the western Antarctic peninsula, krill 
populations have dropped by up to 80% 
since 1976. Fisheries exacerbate the scarcity, 
as demand for krill in the aquaculture and 
pharmaceutical industries increases. 

Montaigne gives an accurate portrayal of 
the breeding cycle and habits of the Adélie 
penguin. With their late maturity, low fecun- 
dity and extended generations, long-lived 
organisms such as penguins (and Arctic 
polar bears) are particularly sensitive and 
thus vulnerable to rapid environmental 
changes and extreme events. For example, my 
colleagues and I have shown that an increase 
of only 0.3°C in sea-surface temperature in 
the marginal sea-ice zone leads to a 10% drop 
in the survival rate of king penguins. 

In revealing the tragic fate of the Adélie 
penguin, Montaigne has found an effec- 
tive way to communicate the impact of 
human-induced global climate change. He 
ably explains complex climate mechanisms, 
such as how shifts in atmospheric circula- 
tion patterns like the Arctic Oscillation can 
pump warmer air into the polar regions from 
lower latitudes, and why some parts of the 
Antarctic are becoming colder when most 
of the peninsula's glaciers are in retreat and 
massive ice shelves are collapsing. 

Fraser forecasts that, in the next decade, the 
Adélie penguins around Palmer will become 
memories. Rising temperatures around the 
Antarctic are pushing specialized polar spe- 
cies such as the Adélie to regional extinction. 
Two sub-Antarctic penguin species, gentoos 
and chinstraps, seem to be benefiting from 
climate change by expanding south. Yet 
they too depend on krill and winter sea ice. 
Although these two species might be able 
to shift their diet to other prey — gentoos 
dive deeper than the Adélie and chinstraps 
can feed at night — the ongoing ecosystem 
upheaval will jeopardize all penguins’ exist- 
ence in the near future. 

Having worked in a more southerly part of 
Antarctica not yet so transformed by global 
warming, I found the book a piercing cry 
of alarm. I realize that I am lucky to have
had the chance to contemplate the ecology 
of a pristine polar environment. As Fraser's 
Penguins shows, the beginnings of the cata-
strophic consequences of global warming
are only just visible now. The next genera- 
tion of scientists may witness these changes 
accelerating at a dramatic pace.

Dog, Inc.: The Uncanny Inside Story of Cloning Man’s Best Friend
John Woestendiek AVERY 320 pp. $26 (2010)
Pet cloning is big business. Investigative reporter John Woestendiek 
looks behind the scenes at the emerging industry of commercial
dog cloning. It started in 2008 with a pitbull called Booger, whose
American owner loved him so much she paid US$50,000 to a South
Korean firm to produce a litter of his identical offspring. Woestendiek
suggests that the ethics of dog cloning is driven as much by our love
of man’s best friend as by the underlying science. He asks whether
our obsession with animals makes us more likely to transgress
ethical boundaries.


Swallow: Foreign Bodies, Their Ingestion, Inspiration, and the 
Curious Doctor Who Extracted Them 

Mary Cappello THE NEW PRESS 336 pp. $27.95 (2010) 

Coins, jewellery, a padlock, a toy goat — people ingest the strangest 
things. Focusing on items rescued from patients’ stomachs, award- 
winning writer Mary Cappello explores the psychology of why people 
eat non-nutritional objects. Her book centres on physician Chevalier 
Jackson’s collection of swallowed artefacts in Philadelphia’s Mütter
Museum. Through the tales behind that exhibit, she unearths a 
history of class and poverty that compelled boys to swallow their last 
coins, and explores colourful characters such as sword swallowers. 


Ourselves Unborn: A History of the Fetus in Modern America

Sara Dubow OXFORD UNIVERSITY PRESS 320 pp. $29.95 (2010)

We attach to the fetus a host of meanings: political, cultural and
scientific. Historian Sara Dubow argues that these are largely based
on our notions of identity, authority and sexuality, rather than fact
or theology. She examines how these meanings have changed
throughout history. Since the late nineteenth century, the fetus has
been at the centre of a tug of war between science and religion.
Although technology brought a greater understanding of embryo
development in the twentieth century, social change has also made
the fetus the subject of controversy.




America Identified: Biometric Technology and Society 

Lisa S. Nelson THE MIT PRESS 200 pp. $32 (2010) 

Biometric technologies, such as fingerprint sensors, retina
scans and handwriting analysis, are increasingly used to identify
individuals. Drawing on research with focus groups, political scientist
Lisa Nelson explores public attitudes to surveillance. She describes
how public users of these technologies are sensitive to issues of
privacy, trust and confidence in the institutions that acquire it. The
expansion of these identification methods by governments through
history, she explains, has bred distrust in biometrics, and highlights
the need to balance harm, prevention and liberty.




Virtual Teamwork: Mastering the Art and Practice of Online 
Learning and Corporate Collaboration 

Edited by Robert Ubell WILEY 268 pp. $49.95 (2010) 

Scientists increasingly work and teach in collaborations that have 
remote members. This collection of expert perspectives, edited by 
enterprise-learning professor Robert Ubell, offers a practical guide 
to virtual teamwork. It explains how to communicate across borders 
of geography, culture and motivational style to manage productive 
exchanges between participants. The essays offer advice on running 
online class projects and detail the latest virtual team technology.

IN RETROSPECT


Pauling’s primer 


Linus Pauling’s book on bonding brought quantum 
mechanics into practical chemistry, finds Philip Ball. 


Linus Pauling’s The Nature of the
Chemical Bond has, like Isaac New-
ton’s Principia or Charles Darwin's On
the Origin of Species, the kind of iconic status 
that, for some, removes any obligation to read 
it. Every chemist learns of Pauling’s role in 
uniting the view of molecules as assemblies 
of atoms with the quantum-mechanical pic- 
ture of atomic wave functions. But his book 
is long and mathematical, and more versa- 
tile approaches have emerged since it was 
published in 1939. Nevertheless, more than 
70 years on, as we prepare for the Interna- 
tional Year of Chemistry 2011, it remains a 
surprisingly good primer on chemical bond- 
ing that translates abstract quantum theory 
into the practical language of chemistry. 
When Pauling’s book was first published, 
some textbooks were still presenting an 
essentially nineteenth-century view of the 
bond. The term was introduced in 1866 by 
English chemist Edward Frankland, who 
regarded the chemical bond as a force akin 
to gravity. Jöns Jakob Berzelius
tion of oppositely charged ions. That
view was encour-
aged by the discovery of the electron in 1897,
as ions could result from an exchange of elec- 
trons between atoms. 

But G. N. Lewis at the University of Cali- 
fornia, Berkeley, argued that bonding may 
result from sharing, not exchange, of elec- 
trons. Shared electrons give rise to what Irving 
Langmuir later called a covalent bond, which 
links neutral atoms. In 1916, Lewis suggested 
that atoms are stabilized by having a full ‘octet’ 
of electrons, which he saw as the corners of a 
cube; the octet may be completed by linking 
corners or edges with adjacent atoms. The 
model, which was popularized (or, in Lewis's 
bitter view, appropriated) by Langmuir, 
seemed vindicated when physicist Niels Bohr 
explained how the octets arise from quantum 
theory as discrete electron shells. 

Yet because it considered only the indi- 
vidual atoms, this remained a rudimentary 
grafting of quantum theory on to the concepts 
that chemists used to rationalize molecular
formulae. Pauling, a supremely gifted young
man from a poor family in Oregon who won a
scholarship to the prestigious California Institute
of Technology (Caltech) in 1922, was con-
vinced that chemical bonding needed to be
understood from quantum first principles.
He wasn’t alone: Richard Tolman at Caltech
notably held the same view. But Pauling had
a golden opportunity to develop it when, in
1926, he went to Europe on a Guggenheim
fellowship to visit the architects of quantum
theory: Bohr at Copenhagen, Arnold Som-
merfeld at Munich and Erwin Schrödinger
at Zurich. He also met Fritz London and his
student Walter Heitler, who in 1927 published
their quantum-mechanical description of
the hydrogen molecule. They had found an
approximate way to write the wave function
of the molecule that, when inserted into the
Schrödinger equation, allowed them to calcu-
late a binding energy that was in reasonable
agreement with experiment.

Pauling generalized this treatment into
the valence-bond model, which considers
chemical bonds to be formed by the over-
lap of single-atom electron orbitals. He put
forward the idea of ‘resonance’ in molecules
for which more than one valence-bond struc-
ture can be drawn; for example, in H₂⁺ the
single electron can be considered to reside
on either hydrogen atom, and the molecule
is said to resonate between the alternatives. In
such cases, the mixed state has a lower energy
than any of the contributing structures.

The Nature of the Chemical Bond
LINUS PAULING
Cornell University Press: 1939. 429 pp.


A NEW GEOMETRY
Pauling also proposed that ‘hybrid’ blends of 
atomic electron orbitals with new geometries 
may arise in some molecules. In methane, for 
example, the central carbon atom attaches to 
four hydrogen atoms in a tetrahedral shape; 
this configuration can be rationalized as 
the mathematical mixing of the atomic 2s 
and three 2p orbitals in carbon to give four 
tetrahedrally distributed sp³ hybrid orbit-
als. These ideas on resonance and hybridi-
zation were published in a series of papers 
in 1928-31, which formed the core of The 
Nature of the Chemical Bond. 

The book's scope is exhaustive. It brings
multiple covalent bonds and ionic, metallic
and hydrogen bonds all within the valence- 
bond framework, and explains how the ideas 
fit with observations of bond lengths and 
ionic sizes in X-ray crystallography — the 
technique Pauling mastered at Caltech that 
led to his seminal work in the early 1950s on 
the structure of proteins and nucleic acids. 

Pauling acknowledged that his treatment 
of chemical bonds is ultimately arbitrary 
— his resonating configurations of nuclei 
and electrons are conceptual fictions that 
allow us to estimate the molecule’s energy. It
is the wave function of the entire molecule 
that actually describes how the electrons are 
distributed in space. But he was resistant to 
recognizing alternative theories. 

In particular, in the late 1920s, Robert 
Mulliken at the University of Chicago, 
Illinois, and Friedrich Hund at the University 
of Göttingen, Germany, approximated the
electron wave functions in a different way 
to Pauling’s valence-bond theory, giving rise 
to ‘molecular orbitals’ in which electrons are 
distributed over several nuclei. Their model 
offered a simpler account of the quantum 
energy levels as revealed by molecular 
electronic spectra. And it supplied a single 
description of some molecules for which 
the valence-bond approach had to invoke 
resonance between many structures. This 
was especially true for aromatic molecules 
such as benzene: the valence-bond model 
needed around 48 separate structures for 
naphthalene (two fused benzene rings), and 
no fewer than 560 for the organometallic 
compound ferrocene. 

Although neither the valence-bond nor 
the molecular-orbital model could claim to 
be more correct than the other, the latter had 
practical advantages. This was known even by 
reviewers of Pauling’s book — some criticized 
him for not mentioning the rival theory, and 
one reviewer suspected that the valence-bond 
method might triumph purely because of 
Pauling’s superior presentation skills. By the 
1970s, however, most chemists accepted that 
molecular-orbital theory was usually more 
convenient, although Pauling never did. 

The significance of The Nature of the
Chemical Bond was not so much that it 
pioneered the quantum-mechanical view 
of bonding, but that it made this a chemical 
theory: a description that chemists could 
understand and use, rather than a mathe-
matical account of wave functions. It recog-
nized that, if a model of physical phenomena
is to be useful, it needs to accommodate itself 
to the intuitions and heuristics that enable 
scientists to talk coherently about the prob- 
lem. Emerging from the forefront of physics, 
this was nevertheless a chemists’ book.


Philip Ball is a writer based in London. 
His forthcoming book is Unnatural: The 
Heretical Idea of Making People. 


This detail from a conceptual model of the evolution and structure of science shows emerging fields in 
blue, funding boosts in yellow and gaps in knowledge as black voids. 


COMMUNICATION 


Mapping science 


Ben Shneiderman enjoys a tome full of tools for discovery. 


The desire to visualize science is intense.
Whereas telescopes, microscopes and
magnetic resonance imaging (MRI)
scans have revealed aspects of the natural 
world, new tools are needed to study science 
itself and how it changes over time. The chal- 
lenge of depicting intangible processes has 
invigorated the growing research community 
dedicated to information visualization. From 
capturing moments of discovery to watching 
emerging research fronts, such tools can help 
us to understand the dynamics of innovation
and guide its future. 

In the Atlas of Science, information scientist 
Katy Börner highlights examples that summa-
rize the evolution of research and its interlock-
ing communities in pictorial form. The book 
accompanies Börner’s ambitious travelling
exhibitions, Places & Spaces: Mapping Science, 
an ongoing programme of well-crafted visual 
presentations that have conveyed aspects of 
science to the public in libraries and museums 
since 2005 (http://scimaps.org). Contributors 
to the book get bylines and photos, making the 
collection a collaborative effort with diverse 
voices. Each two-page spread is a sumptuous
feast of dense prose, delicious visuals and
engaging quotations. Börner’s use of map-
making as metaphor is mostly on target, but it
underemphasizes the inherently interactive
nature of information visualization.

Atlas of Science: Visualizing What We Know
KATY BÖRNER
MIT Press: 2010. 288 pp. $29.95

Unlike in scientific visualization, which 
centres on three-dimensional representa- 
tions of objects such as stacked MRI scans, 
researchers who visualize information seek 
patterns, clusters, relationships, gaps and 
anomalies in many dimensions. Such methods 
may be used, for example, to study financial 
trading patterns over time, hierarchical struc- 
tures in library catalogues, networks of social 
relationships and medical patient attributes. 
In exploring these multi-dimensional spaces 
with bespoke software, users manipulate con-
trol panels to zoom in on desired items, filter
out undesired items and select details.
The past decade has produced a steady flow
of prototype software for information visualization, such as
Spotfire, Tableau, ILOG and Hive Group. 
Many of these commercial success stories 
have been acquired by large business-intel- 
ligence or software companies. Despite the 
wide impact of these programmes in drug 
discovery, genomic data analysis and social- 
network analysis, they unfortunately get little 
mention in the Atlas of Science. 

These tools also support discovery by inte- 
grating rich data manipulation and statisti- 
cal analyses. Data-sharing platforms such as 
ManyEyes or Swivel encourage discussion 
around visualizations, and US government 
sites such as data.gov and recovery.gov raise 
expectations of open data and cultivate policy- 
oriented communities. The growing interest 
in ‘big data’ has spread from the pure sciences
to the social sciences and humanities. Some 
journalists have also become innovators in 
presenting graphic data, providing readers 
with the same opportunity to explore infor- 
mation and make their own discoveries. 

In the Atlas of Science, Börner sets out
the story of scientific map-making well. She 
shows a range of examples based on aspects 
of science: geographical maps, historical 
timelines, taxonomic hierarchies, citation 
networks and various forms of textual graph- 
ics. Readers will learn about the geographic 
concentrations of the creative class in Europe, 
North America and Japan; Wikipedia 
editing patterns; rising patent citations; and 
pathways to discoveries such as the structure 
of DNA. A recurring theme is the relative size 
and connectedness of disciplines, from the 
expected closeness of biology and ecology 
to the surprising linkage between computer 
science and social sciences. 

Börner is generous in giving credit to many
scientific map-makers, but her choice is sub- 
jective and some readers will favour different 
heroes. The book mostly lacks critiques — only 
one visualization is challenged for its hard-to- 
read labels and partially obscured links. But 
other displays have advantages and drawbacks 
that merit debate. Börner and her contributors
sometimes seem more entranced by a compel-
ling visual than by its comprehensibility.

In converting such displays to static 
paper, the Atlas of Science necessarily loses 
the interactive nature of information visu- 
alization. Seeing inspirational photos from 
Roman Vishniac’s microscope or the Hubble 
Space Telescope can only suggest the excite- 
ment of those who operate the controls. 
Nevertheless, Börner’s magnificent book
offers provocative new maps of science that
will inspire fresh thinking.


Ben Shneiderman is professor of computer 
science at the University of Maryland, 
College Park, Maryland 20742, USA, and 
co-author of Analyzing Social Media 
Networks with NodeXL. 

e-mail: ben@cs.umd.edu

Science fit for a king


Laura Spinney visits a Versailles exhibition of curiosities. 


given a male rhinoceros for the royal 
menagerie at Versailles, where it caused a 
sensation. The animal remained there for 22 
years, until it was killed by a sabre thrust dur- 
ing the French Revolution. Its skin was later 
stretched over an oak frame and displayed 
at the Natural History Museum in Paris. For 
the next few months it is back at Versailles, 
welcoming visitors to the exhibition Science 
and Curiosities at the Court of Versailles. 
The rhino embodies the exhibition's two 
main themes: science as spectacle, and 
science in the service of the state. King Louis 
XIV ushered in an era of frenetic scientific 
activity after his chief minister, Jean-Baptiste 
Colbert, persuaded him to establish a national 
academy of science in 1666 — as depicted in 
a 1680 painting by Henri Testelin (pictured). 
Louis XIV imposed no rules on his academi- 
cians, but he did pay them, so their projects 
had to be useful to him. Being invited to 
demonstrate your discovery or invention at 
court was the best way to get it known. 
Scientists came to Versailles from far 
and wide to help create a splendid royal 
residence. They included astronomer 
Giovanni Domenico Cassini, who directed 
the Paris Observatory from its opening in 
1671. Geometricians and astronomers laid
out the gardens (the instruments they
used are on display), hydraulics experts
pondered diverting the River Eure to fill
the lakes (a project that was never completed) and explorers
filled greenhouses and the menagerie with
exotic species. The forerunner of the lift,
the flying chair, was invented at Versailles to
transport Louis XV’s mistresses upstairs; a
full-size mock-up is shown.

Science and Curiosities at the Court of Versailles
The Palace of Versailles, France.
Until 27 February 2011.

It wasn't all self-serving. Under Louis XIV, 
Cassini charted the Moon’s terrain, and Louis 
XV paid the Cassini family to create the first 
map of France, parts of which are on show, 
revealing detail down to the most isolated 
windmill. Louis XV also had a passion for 
scientific instruments, such as an astronomi- 
cal clock showing the Moon’s phases and 
movements of the planets around the Sun 
according to Nicolaus Copernicus. 

Louis XVI encouraged agricultural 
research with the hope of eradicating famine 
in France. He was rewarded in 1786 when 
Antoine-Augustin Parmentier presented him 
with the flowers of the potato, or ‘poor bread’, 
whose cultivation he had been perfecting. 
Years before Edward Jenner came up with 
his vaccine, thousands of people, including 
Louis XVI himself, benefited from a crude 
form of inoculation against smallpox — the 
disease that had killed Louis XV, according 
to his displayed medical certificate. 

A 1680 painting by Henri Testelin celebrates the achievements of the French national academy of sciences during the reign of Louis XIV. 

Science was also entertainment at Ver- 
sailles: demonstrations drew large crowds. 
In 1746, in the Hall of Mirrors, eyewitnesses 
described how Abbot Nollet literally shocked 
140 hand-holding aristocrats with static elec- 
tricity. Full-sized battleships were floated on 
the royal lakes, and in 1783, brothers Joseph 
and Etienne Montgolfier demonstrated their 
hot-air balloon in a palace courtyard. 

Royal women also played a part in the 
scientific adventure. One of Louis XV’s 
mistresses, Madame de Pompadour, sup- 
ported the Encyclopédie of Denis Diderot 
and Jean le Rond d'Alembert, a volume of 
radical Enlightenment thinking that was 
banned by the French government. Visitors 
can see Marie-Antoinette’s dulcimer player, 
a primitive robot that hammers out tunes on 
a stringed instrument. The queen bought it 
in 1784 and, realizing its scientific value, 
donated it to the academy a year later. 

Marie-Antoinette, the rhino and the 
academy all met their ends in 1793 at the 
height of the Revolution. But the academy 
proved thicker-skinned than the queen and 
the ungulate. Realizing that they needed 
scientists after all, the revolutionaries 
recreated the academy in 1795 in its current 
form — as one of the five that make up the 
Institut de France. 


Laura Spinney is a writer based in 
Lausanne, Switzerland. 
e-mail: Ifspinney@googlemail.com 
Q&A Roger Penrose 
Impossible thoughts 


As he publishes his collected works — six volumes comprising more than 5,000 pages — the 
mathematical physicist muses on 50 years of groundbreaking research in general relativity, 
quantum mechanics, cosmology, geometry and consciousness. 


Your Collected Works includes a diverse 
range of papers. Is there a theme? 

Most of them involve a particular point 
of view on how to unify space-time struc- 
ture with quantum mechanics. I believe 
that quantum mechanics is not the whole 
story. On some scales, the rules of quantum 
mechanics have to be violated. There has to 
be some other ingredient that, I suspect, has 
to do with gravity. 


You are currently working on a book called 
Fashion, Faith and Fantasy. What is it about? 
I rashly suggested that title for three lectures 
I gave at Princeton University in 2003. ‘Fashion’ 
refers mainly to string theory, which has 
many merits but is not believable. I don’t 
see how you can make sense of all those 
extra dimensions. ‘Faith’ refers to quantum 
mechanics. It's a wonderful theory and works 
beautifully, but is self-inconsistent — in my 
view, when you make a measurement, you 
violate the Schrödinger equation. At some 
scale in the Universe, quantum mechanics 
will have to be replaced by a better theory. 


And ‘fantasy’? 

That’s largely directed at cosmic inflation, 
in which the Universe is supposed to have 
expanded by an enormous factor just after 
the Big Bang. I’ve always been against this 
— it can only work if you start off in a very 
special state. In my recent book Cycles of 
Time, I propose my […] one stage in a succession. 
What we think of as the Big Bang is not 
the beginning. It’s the continuation of the 
remote future of a previous aeon. 


How might we know if that is true? 

The cosmic microwave background — the 
radiation left over from the Big Bang — would 
reveal evidence of events taking place in the 
aeon before ours, mainly encounters between 
supermassive black holes. When galaxies 
collide, their central black holes may spiral 
around and swallow each other up, causing 
an enormous burst of gravitational radiation. 
Such a burst from late in the previous aeon 
would leave its mark as circles around which 
the temperature is anomalously uniform. My 
colleague Vahe Gurzadyan sees tentative signs 
of them [see go.nature.com/Lbwiou]. 


What does mathematics have to say about 
consciousness? 

In my 1989 book The Emperor’s New Mind, I 
said that computers will not achieve any conscious 
understanding. […] so understanding is not a computational 
process. Something else is going on. I have 
reason to believe it may involve the limits 
of quantum mechanics. Microtubules [tiny 
structures in cells] are the best candidate in the 
brain for where this might happen, as they are 
so small, but quantum mechanics would have 
to work on a huge scale to operate there. 


How was your work on impossible objects 
taken up by the artist M. C. Escher? 

When I was a graduate student at the Univer- 
sity of Cambridge, I went to a mathematics 
conference in Amsterdam. One lecturer had 
a strange picture of birds, Escher’s Day and 
Night. I decided to try something new and 
produced the tribar, an impossible triangle. 
My father also produced an impossible stair- 
case, which goes round and round. We wrote 
a paper and sent a copy to Escher, crediting 
him. He developed the ideas into two prints: 
Ascending and Descending, with monks 
going around an impossible staircase; and 
Waterfall, which incorporates the tribar. 


Did you meet him? 

I visited Escher once. I had some angular 
wooden tiles, all the same shape, which I gave 
to him to see whether he could cover a plane 
without any gaps or overlaps. One of the last 
pictures he produced shows the arrangement 
using ghost shapes. Later, after both Escher 
and my father died, I produced the first never- 
repeating pair of tile shapes [Penrose tiles]. It 
was a shame they didn't live longer because 
I'm sure Escher would have done something 
wonderful with them, and my father would 
have got a great kick out of it. 


Was your father a major influence? 

Yes. He was a human-genetics professor 
at University College London and studied 
the inheritance of mental disease. He was 
interested in the question of consciousness, 
too. He’d cut things out of wood and make 
puzzles for children, and was interested in 
games, chess in particular. I took no inter- 
est in chess myself, but my younger brother 
became British champion ten times. My 
older brother went on to become a highly 
respected mathematical physicist. To my 
father, science was very much like a game. 


Yet you were a slow learner as a child? 

At school in Ontario during the Second 
World War, I once got moved down a class 
because I couldn't do mental arithmetic. I 
was slower than the others. Then one teacher 
said, “You can have as long as you like to do 
the tests.” Given time, I did extremely well. 
That's always been true of me. It takes me a 
long time to think things through. Luckily I 
can get away with little sleep. I compensate 
by working into the night. 


INTERVIEW BY JASCHA HOFFMAN 


23/30 DECEMBER 2010 | VOL 468 | NATURE | 1039 




CORRESPONDENCE 


No crisis in supply 
of peer reviewers 


At the journal Molecular Ecology, 
we find little evidence for the 
common belief that the peer- 
review system is overburdened by 
the rising tide of submissions. 

We analysed the number of 
requests required in 2001-10 
to obtain a review; compared 
the number of submissions in 
2001-07 with the number of 
unique reviewer names in each 
year; and calculated the mean 
number of reviews per reviewer 
in 2001-07 (see go.nature.com/ 
68mh16). 

The idea that it is now harder to 
find reviewers turns out to be true 
(the mean number of reviewing 
requests issued per review 
increased from 1.38 (s.e. = 0.02) in 
2001 to 2.03 (s.e. = 0.05) in 2010). 
However, this seems to be due to 
changes in technology rather than 
to changes in reviewers’ attitudes: 
the declining acceptance rate 
by invited reviewers strongly 
correlates with the 2008 transition 
from an e-mail-based editorial 
system to an automated one, 
perhaps because spam filters 
blocked e-mail invitations. 

We also found that the reviewer 
pool expanded in proportion to 
the increased submission rate 
(which doubled between 2001 
and 2007), yet there was no 
increase in the average number of 
reviews by individual reviewers. 

The authors of the additional 
papers are the most likely source 
of the extra reviewers. Each 
Molecular Ecology submission 
has an average of 4.5 authors 
and decisions are based on an 
average of 2.7 reviews, so only 
0.6 reviews per co-author are 
required to compensate for 
the review burden of each new 
article. These figures indicate that 
the reviewer pool still seems able 
to accommodate the increasing 
number of submissions. 
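
The arithmetic behind the 0.6 figure can be checked in a few lines. This is only a sketch: the variable and function names are illustrative and not from the letter, and it assumes every co-author of a new submission is available as a potential reviewer.

```python
# Figures quoted in the letter for Molecular Ecology submissions.
AUTHORS_PER_SUBMISSION = 4.5
REVIEWS_PER_DECISION = 2.7


def reviews_needed_per_author(reviews: float, authors: float) -> float:
    """Reviews each co-author would have to write to offset one new submission."""
    return reviews / authors


per_author = reviews_needed_per_author(REVIEWS_PER_DECISION, AUTHORS_PER_SUBMISSION)
print(f"{per_author:.1f} reviews per co-author")
```

If each co-author writes fewer than one review per paper they submit, the extra submissions can in principle be absorbed without raising the load on existing reviewers.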

Tim Vines, Loren Rieseberg 
Molecular Ecology, University of 
British Columbia, Canada. 


managing.editor@molecol.com 
Harry Smith University of 
Leicester, UK. 
Brazil’s renewable 
energy success 


Brazil’s advanced energy matrix 
is starting to pay off: 47.3% 

of its primary energy is now 
renewable. The world average is 
still around 13%. 

Last year, Brazil produced 
244 million TOE (tonnes of oil 
equivalent), of which 42.6% 
came from oil and coal, and the 
rest from sugar cane (18.2%), 
hydropower (15.2%), biomass 
(13.9%), natural gas (8.7%) and 
uranium (1.4%). 

Ethanol accounted for 18.8% 
of fuel usage, and natural gas and 
biodiesel for 3.3%. In just 2 years, 
Brazil has reached its target of 
5% biodiesel additive in diesel. 
Ethanol is set to overtake petrol 
as fuel, thanks to flexible-fuel 
engines that use both at the same 
time. These account for 90% of 
small-car sales in the past 2 years. 

The country is developing the 
technology for ‘green’ petrol and 
diesel production from sugar cane 
and agricultural waste, and from 
the castor-oil residue generated 
during biodiesel manufacture. 

Brazil's government estimates 
that only about 2.5% of arable 
land will be needed to meet the 
ethanol demand forecast for 2017 
(today this is 1.4%). Burning 
of sugar-cane pulp (bagasse) is 
expected to supply 15% of Brazil’s 
electricity by 2017, comparable 
to that being generated by the 
Itaipu hydropower plant on the 
Brazil-Paraguay border. 

Allan Kardec Duailibe National 
Agency of Petroleum, Natural Gas 
and Biofuels, Brazil. 
allan@anp.gov.br 


Economic growth: a 
gross measure 


Gross domestic product (GDP) 
is as poor a measure of the 
economy as it is of welfare 
(Nature 468, 370-371; 2010). 

Quantifying the concept of ‘the 
economy’ is contentious because 
of arbitrary decisions as to what 
to include, and because of a drift 
when indexing to constant prices. 
For example, should the jump 
in price from Walkman to iPod 
be classed as inflation or as a 10” 
increase in storage productivity? 

And whereas economic 
growth theory uses a production 
value that is net of depreciation, 
GDP is a gross measure. Thus 
GDP looks good even when 
things are falling apart. Being 
the fastest-growing economy in 
the 2000s was actually a sign of 
economic distress, not success, 
for the United Kingdom. 

The economy, as a complex 
system, cannot logically be 
indexed by a single figure. In 
truth, GDP just reflects the 
perspective of the tax base, 
because that is how the figures are 
collected and presumably why the 
UK Treasury is keen to use it. 
Force-feeding an economy’s 
GDP index usually empties 
its environmental capital first, 
then its social capital and then 
whatever cash is left in the bank. 
David Fisk Imperial College 
London, UK. 
d.fisk@imperial.ac.uk 


Misreporting: hippo 
stories off-target 


Prevention is better than cure 
when it comes to the weight 
of ill-informed public opinion 
resulting from the misreporting 
of science by the media (Nature 
468, 7; 2010). 

Take the story of a 
hippopotamus coming round 
unexpectedly in an African 
national park after incomplete 
delivery of an immobilizer 
drug cocktail by a faulty dart. 
The animal had to be shot 
after attacking the attending 
researchers. 

News of the killing spread 
rapidly (see go.nature.com/ 
upkcu7), the story becoming 
more sensational with each 
rewriting (see go.nature.com/ 
rhhx7c). It prompted a public 
outcry and led some people to 
question why the research was 
being performed in the first place. 

Not reported was that the 
new drug cocktail had until 
then been used with 100% 
success on more than 20 hippos, 
and that previously trialled 
immobilization drugs had killed 
a quarter of the hippos tested. 
The real news is that this cocktail 
represents a breakthrough in the 
management and conservation of 
the species. 

P. J. N. de Bruyn University of 
Pretoria, South Africa. 
pjndebruyn@zoology.up.ac.za 
OBITUARY 


Brian Marsden 


(1937-2010) 


The walking encyclopedia of comets. 


Brian Geoffrey Marsden was the 
‘go-to’ man for comets — icy bodies 
that release gas or dust as they travel 
around the Sun — as well as for the thou- 
sands of named asteroids or ‘flying rocks’. 

For more than three decades, Marsden, 
who died aged 73 on 18 November, headed 
an effort to locate objects that had once been 
observed and named, but that could no longer 
be tracked because the original observations 
had been insufficiently precise. His favour- 
ite recovery was the comet Swift-Tuttle, first 
sighted in 1862 but lost a couple of years later. 
Conventional wisdom held that it would 
return around 1981, but Marsden suspected 
that the 1862 comet had the same properties 
as one seen in 1737, and this allowed him to 
predict, correctly, that Swift-Tuttle would not 
return until late in 1992. 

Born in Cambridge, UK, Marsden was 
developing primitive ways to calculate the 
positions of planets by the age of 11. As a 
teenager, he began to compute the locations 
of comets using logarithm tables. By the time 
he received his undergraduate degree from 
New College, Oxford, UK, he was widely 
known for being able to calculate comet 
orbits accurately. 

Marsden enrolled as a graduate at Yale 
University in New Haven, Connecticut, in 
1959, and soon programmed the university's 
IBM 650 computer to calculate comet orbits. 
In 1965, Fred Whipple invited him to join 
his staff at the Smithsonian Astrophysical 
Observatory in Cambridge, Massachusetts. 
Whipple, then the director of the observatory, 
had recently proposed the ‘dirty snowball’ 
model — the idea that comets consist mostly 
of ice mixed with dust. The computer pro- 
grams that Marsden developed to model the 
orbiting paths predicted by Whipple's theory 
are still widely used by astronomers. 


TAKING THE REINS 
As the director of the Central Bureau for 
Astronomical Telegrams (CBAT), I had trans- 
ferred it from Copenhagen to Cambridge, 
Massachusetts, shortly before Marsden 
arrived at the Smithsonian. The CBAT has, 
since 1920, been responsible for informing 
the world’s astronomers — on the behalf of 
the International Astronomical Union (IAU) 
— about comets and other objects of astro- 
nomical interest that change rapidly. It is also 
in charge of naming comets. 

On the day before Marsden was officially 
to take up his staff position at the observatory, 
we had a press conference to inform the public 
that the remarkable comet Ikeya–Seki would 
become bright enough to be seen near the 
Sun in broad daylight. Fortunately, Marsden 
joined us, because there were questions 
about historical comets that only he could 
answer. In the days that followed, it quickly 
became clear that he would prove an indis- 
pensable member of the CBAT team, and by 
the next IAU Congress, in 1968, I was more 
than happy to hand over the reins to him. 
Marsden was director for 32 years. This 
was an onerous assignment, because he 
had to be on duty 24 hours a day, 7 days a 
week, in case a brilliant supernova burst into 
view. Most astronomers little appreciated 
the service that Marsden rendered during 
those decades, although in 1989 he did win 
the American Astronomical Society's George 
Van Biesbroeck Prize ‘for service to astron- 
omy’, and later he won a similar prize from 
the UK Royal Astronomical Society. 
Besides directing the CBAT, Marsden took 
over another IAU bureau, the Minor Planet 
Center, in 1978. Under his leadership, all the 
asteroids that had been lost were located again. 
There are now about half a million asteroids 
with known orbits, more than 16,000 of which 
have official names. In his role in various IAU 
committees, Marsden became, in effect, the 
chief namer of asteroids, although the full 
committee had to ratify his suggestions. 

Because the procedures used to measure 
the positions of Pluto were the same as those 
used for minor planets, Marsden proposed, 
in 1999, to give it the minor planet number 
10,000. He considered Pluto to be the first of 
the trans-Neptunian objects — icy objects 
that orbit the Sun but that are farther from 
it than Neptune. 

His proposal stirred up an inordinate 
controversy among the public, and the IAU 
executive committee forbade such a move. 
In 1993, after three more trans-Neptunian 
objects were found, Marsden had been the 
first to suggest that they were all like Pluto 
in orbiting the Sun twice in the time it takes 
Neptune to orbit it three times. 
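
The 2:3 relationship fixes how far from the Sun such an object must orbit. The sketch below applies Kepler's third law (period squared proportional to semi-major axis cubed) to estimate that distance. The value used for Neptune's semi-major axis is an approximate figure supplied here as an assumption, and the function name is illustrative; neither comes from the obituary.

```python
# Kepler's third law for bodies orbiting the Sun: T^2 is proportional to a^3,
# so a_outer = a_inner * (T_outer / T_inner) ** (2/3).

NEPTUNE_SEMI_MAJOR_AXIS_AU = 30.07  # assumption: approximate value, not from the text


def resonant_semi_major_axis(a_inner_au: float, inner_orbits: int, outer_orbits: int) -> float:
    """Semi-major axis of an outer body that completes `outer_orbits`
    in the time the inner body completes `inner_orbits`."""
    period_ratio = inner_orbits / outer_orbits  # T_outer / T_inner
    return a_inner_au * period_ratio ** (2 / 3)


# Two orbits of the outer body for every three orbits of Neptune.
a_resonant = resonant_semi_major_axis(
    NEPTUNE_SEMI_MAJOR_AXIS_AU, inner_orbits=3, outer_orbits=2
)
print(f"Semi-major axis in 2:3 resonance with Neptune: {a_resonant:.1f} AU")
```

The result, a little over 39 astronomical units, is close to Pluto's mean distance from the Sun, which is why objects sharing this resonance cluster with it.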

Marsden’s wish to ‘demote’ Pluto was 
granted only after trans-Neptunian objects 
more comparable to Pluto in size were dis- 
covered in 2005. At its triennial meeting in 
2006, the IAU voted to designate these objects 
members of a new class of ‘dwarf planet’ — 
which, paradoxically, are not considered sim- 
ply another kind of planet. At this same IAU 
meeting, Marsden stepped down as director 
of the Minor Planet Center. He was “quite 
entertained by the thought that both he and 
Pluto had been retired on the same day”. 

Brian rarely took breaks from calculat- 
ing the orbits of astronomical objects, and 
would typically be at his desk on a Saturday 
afternoon. A few months ago he was diag- 
nosed with leukaemia. In spite of his illness, 
he continued to come to the observatory. I 
saw him frequently, his office being directly 
across the hall from my own. 

I’ve recently been working on Galileo's 
discovery of the satellites of Jupiter (whose 
orbits had been the topic of Brian's doctoral 
thesis). I told him that I needed a diagram 
showing what the satellite orbits would have 
looked like in January of 1610. He paused for 
a few minutes: “Try the 1941 Nautical Alma- 
nac,” he suggested. The match is amazing. 
Three weeks later, pneumonia took its toll 
on a weakened immune system. The magic 
will be missed. 


Owen Gingerich is professor emeritus 

of astronomy and history of science at 

the Harvard-Smithsonian Center for 
Astrophysics, Cambridge, Massachusetts 
02138, USA, and was a colleague of Brian 
Marsden’s for 45 years. 

e-mail: ginger@cfa.harvard.edu 
NEWS & VIEWS 


Shadows of early migrations 


Analysis of ancient nuclear DNA, recovered from 40,000-year-old remains in the Denisova Cave, Siberia, hints at the 
multifaceted interaction of human populations following their migration out of Africa. SEE ARTICLE P.1053 


CARLOS D. BUSTAMANTE & BRENNA M. HENN 


The new discipline of palaeogenetics is 
delivering increasing dividends, the 
latest news coming from Reich, Pääbo 
and colleagues on page 1053 of this issue¹. The 
authors’ analysis of nuclear DNA of a human- 
like finger bone, found in Denisova Cave in 
southern Siberia, points towards a complex 
model of migration and colonization after 
anatomically modern humans moved out of 
Africa some 50,000-60,000 years ago. 

Ever since 1925, when Raymond Dart’s 
report of the first Australopithecus skull in 
southern Africa upended Victorian views of 
human origins, there has been debate over 
whether our species arose only once and 
spread throughout the world, replacing all 
extant species of Homo, or whether our ances- 
tors interbred with the other populations and 
subspecies. The most extreme version of the 
‘candelabra’ model of human origins — accord- 
ing to which human species arose multiple times 
independently of our Homo ergaster ancestors 
— has been largely discounted. But it has been 
difficult to assess more nuanced models, such 
as the possibility of genetic exchange with some 
archaic populations, including Neanderthals, 
and now perhaps ancient Siberians. 

Until recently, genetic data and interpre- 
tation of the fossil record seemed to favour 
a complete-replacement model, in which all 
human species trace all of their genetic ances- 
try to a single origin in one or more African 
populations of moderate size some 200,000 
years ago’ °. However, the Denisovan nuclear 
genome sequence’, along with that of Homo 
neanderthalensis published by some of the 
same authors’, suggest that the out-of-Africa 
population history of Homo sapiens is prob- 
ably much more intertwined than previously 
thought, with more intertwining in some parts 
of the world than others. 

On the basis of their analyses of ancient 
DNA from the Neanderthals and Denisovans, 
the Reich–Pääbo team proposes that limited 
gene flow from archaic Homo species to mod- 
ern humans occurred in two brief episodes 
(Fig. 1). One episode occurred shortly after a 
subset of modern humans left Africa, and the 
second occurred only in the ancestors of Mela- 
nesian populations in Oceania. Their inference 
of genetic admixture does not resurrect ortho- 
dox multiregional evolution, which theorizes 
extensive gene flow among Homo species 
across different geographical regions for hun- 
dreds of thousands of years. Nailing specific 
details of a ‘replacement plus limited gene flow’ 
model will require much more work. But the 
broad outlines from sequencing ancient DNA 
provide a fascinating view of our genome, and 
present a hypothesis that can be tested when 
many, more diverse, human genomes (and, one 
hopes, more ancient ones) are available. 

Figure 1 | New hypotheses extend the ‘standard model’ of modern human history. Triangles and circles respectively represent sampling locations of Neanderthal remains and of present-day human genomes. The blue arrows indicate generally accepted major migrations of anatomically modern humans, following their departure from Africa 50,000-60,000 years ago. At this time, there were two primary archaic species in Eurasia, Neanderthals and Homo erectus; Reich, Pääbo and co-workers¹ suggest that a third group was also present, represented by the ancient Denisovan genome. From ancient DNA, they identify additional putative events involving two episodes of limited gene flow: first, genetic admixture from Neanderthals to modern humans, shortly after the exit from Africa; second, subsequent admixture with the archaic population exemplified by the nuclear DNA extracted from the Denisova finger bone. This second event seems to affect only the ancestors of present-day Melanesians, who are thought to have colonized Papua New Guinea some 45,000 years ago. African populations, both past and present, are genetically highly diverse, as indicated by the multiple labels. 

The new work¹ is a follow-up to an earlier 
paper, by a group led by Pääbo, on the deeply 
diverged mitochondrial DNA (mtDNA) 
genome recovered from the same finger 
fragment. Reich, Pääbo and colleagues¹ have 
now sequenced the bone’s nuclear genome to 
approximately 2x coverage — that is, on aver- 
age, they have obtained sequence from two 
ancient DNA fragments that cover a given 
base in the genome. They compare these frag- 
ments with low (1-5x) coverage data from 12 
modern-human genomes, as well as with the 
Neanderthal genome’ sequenced to 1.5x. 
Nuclear DNA comes from the 22 pairs of 
autosomal chromosomes and the sex (X, Y) 
chromosomes. Apart from containing the 
vast majority of genetic information, nuclear 
DNA is well suited for analysis of gene flow 
because genetic recombination provides 
tens of thousands of semi-independent data 
points for comparing genetic relationships 
among present-day and ancient samples. The 
fragments of ancient DNA illuminate our 
understanding of human origins and, like the 
shadows in Plato’s proverbial cave, give us the 
broad outlines of ancient human migrations. 
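
The notion of "2x coverage" used above can be made concrete with a short calculation. This is a rough sketch only: the fragment count, fragment length and genome length are hypothetical round numbers chosen for illustration, not figures from the study, and the estimate of uncovered bases assumes fragments land uniformly at random (a Poisson model), which degraded ancient DNA will not follow exactly.

```python
import math


def mean_coverage(n_fragments: int, mean_fragment_length: float, genome_length: int) -> float:
    """Average number of sequenced fragments overlapping any given base."""
    return n_fragments * mean_fragment_length / genome_length


def fraction_uncovered(coverage: float) -> float:
    """Expected fraction of bases hit by no fragment, under a Poisson model."""
    return math.exp(-coverage)


# Hypothetical inputs: 100 million fragments of 60 bases over a 3-billion-base genome.
coverage = mean_coverage(100_000_000, 60, 3_000_000_000)
print(f"mean coverage: {coverage:.1f}x")
print(f"bases with no data (Poisson estimate): {fraction_uncovered(coverage):.1%}")
```

Even at an average of two fragments per base, a noticeable share of positions is expected to have no data at all, which is one reason low-coverage genomes are compared statistically across many sites rather than base by base.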

And what an interesting story they tell! It 
seems that the Denisovan was most similar 
genetically to Neanderthals, but not so similar 
as to have been sampled from the same popula- 
tion. The Reich–Pääbo team now demonstrates 
that Denisovans and Neanderthals are sister 
taxa, clustering, on average, slightly more often 
than either does with modern human samples. 
Compared with modern humans, the Denis- 
ovan sample clusters slightly more often (about 
1-3% of the time) with the present-day European 
or east Asian genomes as compared with the 
African genomes from the Yoruba, Mbuti and 
San. This is consistent with reported gene flow 
from a Neanderthal population into the ances- 
tors of modern-day Eurasians’, if Denisovans 
and Neanderthals are close sister taxa. 

What is particularly fascinating, how- 
ever, is that the Denisovan sample seems 
to share an extra genetic affinity (beyond 
that for European and Asian genomes) with 
present-day island Melanesians. This is rather 
unexpected, as the earliest occupation of Papua 
New Guinea, an island in Oceania, by mod- 
ern humans occurred only about 45,000 years 
ago”, and suggests quite a complicated 
picture for the ancestry of the Denisovan finger 
fragment. 

Studying ancient molecular diversity is not 
without its pitfalls — the molecular shadows 
we perceive may well have a more complex 
underpinning. In their Supplementary Infor- 
mation, Reich, Pääbo and co-workers¹ go into 
exquisite detail to discount many potential 
sources of bias in their data, including con- 
tamination, handling of the ancient material 
and differences in depth-of-coverage among 
genomes. 

Many of these problems can indeed be dis- 
counted, but some technical hurdles remain. 
Sequencing technology and DNA preservation 
may affect the interpretation of the clustering 
statistic for ancient genomes — for example, 
the finding that greater numbers of derived 
alleles (gene variants) are shared between 
Eurasians and Neanderthals than between 
Eurasians and Denisovans could be due to dif- 
ferences in sequencing technology. Nonethe- 
less, it seems that comparison of ancient and 
modern genomes processed at the same time 
provides a consistent picture of extra allele- 
sharing between Denisovans and present-day 
Melanesians, as well as between Denisovans 
and Neanderthals. 

Perhaps the most powerful use of ancient 
DNA sequencing technology is in the realm 
of hypothesis generation. For example, from 
the Denisovan remains, one can make explicit 
predictions about the patterns of genetic vari- 
ation in modern humans who are yet to have 
their DNA sequenced. Specifically, if there is 
5-7% extra allele-sharing in the genomes of 
Melanesians with an archaic Homo population, 
by sequencing modern individuals from the 
region, every so often we should find oddly 
divergent regions of the genome in some 
Melanesian individuals. The same idea has 
been proposed to test Neanderthal admixture 
models (namely, looking for regions of the 
human genome in which the highly divergent 
fragments of DNA sequence are non-African 
and potentially inherited from an ancient 
population). 

As this work’ illustrates, studies of human 
genomic variation need to expand beyond 
the realm of medical interest. The study of 
diverse human genomes (both ancient and 
present-day) is the most powerful tool avail- 
able for understanding our common human 
origins and history. The success of this 
research depends, of course, on proper com- 
munity and individual engagement of diverse 
peoples (including those from isolated human 
populations), who may possess the genomic 
history of ancient human migrations across 
the globe. Together with the palaeoanthropo- 
logical record, analyses of ancient and modern 
DNA will help us to better understand our own 
creation myths, and illuminate the details of 
the molecular shadows in the cave. 
Electrons spin in the field 


Nanowires are candidates for enabling the exchange of quantum information 
between light and matter. The rapid control of a single electron spin by solely 
electrical means brings this possibility closer. SEE LETTER P.1084 


DAVID J. REILLY 


The quest to develop ways to store and 
manipulate quantum information in 
condensed-matter systems is establish- 
ing a tool kit for controlling the nanoworld — 
one that promises far-reaching technological 
innovation. One example is the idea of encod- 
ing data, both classical’ and quantum’, in the 
spin orientation of a single electron (its intrin- 
sic magnetic moment). During the past five 
years, this vision has largely been realized*’, 
and researchers are now turning to other 
goals, such as high-speed control of the spin 
orientation and the suppression of ‘decoher- 
ence’ processes that lead to a loss of quantum 
information. Innovative methods in quantum 
control®” and new material systems are lead- 
ing the way in tackling this next generation of 
challenges. 

On page 1084 of this issue, Kouwenhoven 
and co-workers” report an experiment that 
exploits the unique material properties of 
an indium arsenide (InAs) semiconductor 
nanowire to rapidly control the quantum state 
of a single electron spin using only electric 
fields. Beyond just flipping the spin orientation 


of a single electron, the authors tailor the pre- 
cise timing of electric-field pulses to extend the 
spin coherence time (during which the infor- 
mation encoded in the quantum state of the 
spin is preserved). 

Controlling electron and nuclear spins 
is central to magnetic resonance technolo- 
gies such as magnetic resonance imaging. 
These technologies use radio- or microwave- 
frequency magnetic fields to manipulate 
some 10” spins in macroscopic volumes. 
On the nanometre scale, the application of 
spatially selective, oscillating magnetic fields 
is a formidable challenge, which makes 
controlling single spins difficult. Although 
proof-of-principle experiments have shown 
that nanometre-scale magnetic control is pos- 
sible’, the time it takes to rotate the orientation 
of the electron spin magnetically is long and 
does not allow for many rotations within a spin 
coherence time. This limitation inhibits the use 
of this technique for quantum information 
processing. 

Kouwenhoven and colleagues’ experiment” 
addresses this shortcoming by moving from 
magnetic to all-electric fields to achieve rapid 
control over the spin. Although an interaction 
between an electron’s spin and an applied electric
field is forbidden, spin-orbit coupling, a quantum
interaction, provides a means of controlling spins
using oscillating electric fields if it is strong enough,
and is at the heart of the new field of ‘spintronics’.

Special relativity requires that an electron 
moving through an electric field experiences 
an effective magnetic field that couples its 
spatial motion (orbit) to its spin. In the sim- 
plest picture, spin-orbit coupling is possible 
because, from the viewpoint of the electron, 
it is the electric field that is moving, and time- 
varying electric fields generate a magnetic field 
that splits the electron’s spin states in energy. 
The detailed picture of spin-orbit coupling 
has played a key part in the formulation of 
quantum mechanics. 
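
The relativistic argument above can be written compactly. As a sketch using standard textbook expressions (not taken from the paper under discussion), an electron moving with velocity v through an electric field E sees, to first order in v/c, an effective magnetic field in its own rest frame, and its two spin states split by a Zeeman-like energy:

```latex
\mathbf{B}_{\mathrm{eff}} \simeq -\frac{\mathbf{v}\times\mathbf{E}}{c^{2}},
\qquad
\Delta E = g\,\mu_{\mathrm{B}}\,\lvert\mathbf{B}_{\mathrm{eff}}\rvert
```

Here g is the electron g-factor and μ_B is the Bohr magneton. The full treatment also includes the Thomas-precession factor of one half, which this simplified picture leaves out.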

For semiconductors in a magnetic field, the 
spin-orbit interaction can be much stronger 
than in an atom, owing to the high electron 
velocities and strong electric-field gradients 
produced by nuclei in the semiconductor 
crystal lattice’. As is the case in Kouwenhoven 
and colleagues’ experiment, careful choice of 
material system and device geometry can lead 
to spin-orbit coupling that is so strong that the 
electron’s spatial state and its spin cannot be 
considered separately: they collectively form a
quantum state that preserves the long-lived spin 
component while allowing for manipulation 
through electric fields'’*"*. 

The signature of spin-orbit control has pre- 
viously been identified in gallium arsenide 
(GaAs) semiconductor quantum devices”, 
but the strong coupling in the InAs nanowire 
devices allows both faster control and the 
potential for the exchange of quantum informa- 
tion between optical and solid-state electronic 
systems. Indeed, optoelectronic devices’*”, 
such as semiconductor LEDs (light-emitting 
diodes), have recently been demonstrated in 
nanowire architectures that are similar to the 
authors’ InAs nanowire, and the possibility of 
transferring the quantum state of a single spin 
to a single photon now seems viable. The crea- 
tion of such hybrid quantum systems is pivotal 
because they allow the unique advantages of 
different quantum platforms to be combined 
to open up new quantum technologies. The 
iPhone provides the perfect example of how 
the tight integration of optical, mechanical and 
electrical devices can have a significant tech- 
nological impact. In quantum mechanics, this 
kind of integration is not easy, owing, in part, 
to the nature of quantum measurement and the 
fragility of systems that manipulate quantum 
information. 

For Kouwenhoven and colleagues’ experi- 
ment”*, an important but perhaps unexpected 
result is that the spin coherence lifetime, meas- 
ured by a technique known as the Hahn echo 
pulse sequence, is significantly shorter than 
in GaAs. The authors’ hunch is that this may 
result from the larger nuclear spin moment 
of indium compared with gallium or arsenic,
which couples uncontrollably to the electron 
spin. To what extent this short time presents 
a fundamental problem requires further 
research, but will undoubtedly drive fresh 
innovation in the science and engineering of 
quantum systems.
Proteins in dynamic equilibrium


Protein molecules in solution exist as an equilibrium of different conformations, 
but the sizes and shifts of these populations cannot be determined from static 
structures. A report now shows how they can be measured in solution. 


PAU BERNADO & MARTIN BLACKLEDGE 


Technologies for determining protein
structure have contributed immensely
to our understanding of molecular
biology, providing us with three-dimensional 
models at atomic resolution to explain the 
molecular basis of physiologically important 
interactions between biochemically active 
molecules’. But as we emerge from a decade 
of massive investment in structural genomic 
projects, it is becoming increasingly clear that 
a complete description of biomolecular activity 
also requires an understanding of the nature 
and role of protein conformational dynamics. 
Reporting in the Proceedings of the National 
Academy of Sciences, Yang et al.” describe a 
method that could provide us with just such 
an understanding — a combination of com- 
putational simulations and experimental X-ray 
scattering data enables the observation of 
shifts in the equilibrium population of protein 
conformational states. 

Proteins must be able to move in order 
to function. Such motion can be on a small 
scale — involving atomic fluctuations around 
an average structure — or can involve large- 
scale reorganization of molecular machinery’. 
Experimental data for proteins normally rep- 
resent average values for the entire ensemble of 
conformations, but structural determinations
routinely represent a single, static structure. 
The dynamic trajectories of protein movement 
can be invoked, by trapping and observing 
active or inactive conformational states and 
deducing the pathway that connects them. 
But direct access to functionally important 
protein motions requires new experimental 
and analytical tools that can accurately map 
conformational equilibria. 

In recent years, structural biologists have 
risen to this challenge by developing tech- 
niques to describe dynamic systems in terms 
of ensembles of structures, thus providing 
information about the importance of molec- 
ular motion for biological function*”. For 
example, nuclear magnetic resonance (NMR) 
spectroscopy provides ensemble-averaged 
experimental parameters that describe the 
intrinsic conformational dynamics that 
control molecular recognition®. Changes in 
global orientations of protein domains, or 
in the shape and size of molecular assem- 
blies, are more difficult to characterize using 
NMR alone, but these can be determined 
using a method known as small-angle X-ray 
scattering (SAXS)’*. 

It is gradually becoming established that the 
most appropriate way to define proteins’ conformational
disorder is to explicitly identify the
ensembles of conformations that coexist and 
rapidly interconvert in dynamic equilibrium. 
Because of the vast number of conformations 
that can potentially be adopted by flexible pro- 
teins, accurate identification of these ensembles 
presents an ill-defined ‘inverse problem’ — how 
can the ensembles be identified from acquired 
data? The solution requires the development 
of robust statistical approaches to determine 
the probability that any particular multi- 
conformational equilibrium will exist”. A true 
statistical mechanical description of an ensem- 
ble also requires a quantitative assessment of 
the weighting of each conformation in the 
Boltzmann probability distribution of confor- 
mations. Yang et al. elegantly address both of 
these considerations in their study’. 

The authors used SAXS to study a multi- 
domain tyrosine kinase enzyme known as 
Hck, which belongs to the Src family of kinases. 
Src kinases are thought to be involved in the 
signalling pathways that govern cell growth 
and proliferation, and are implicated in many 
human diseases, most notably cancer. The regu- 
lation of Src kinases is known to involve large- 
scale reorientation of the proteins’ domains. 

Activation of these enzymes has been pro- 
posed to be a two-step process. In the first 
step, two small domains (SH2 and SH3) form 
intramolecular interactions with the carboxy 
and amino termini of a larger, catalytic domain
to form a compact, inactive ‘assembled’ con- 
formation. In the second step, the release of 
the intramolecular interactions destabilizes the 
compact structure, causing the formation of 
a more open, ‘disassembled’ state (the active
conformation). This model of regulation has 
been delineated from crystal structures of 
different Src proteins at the end points of the 
activation process’*"'. Crucially, however, the 
dynamic flux between these states was poorly 
understood — until Yang et al. published their 
report’. 

The authors studied Hck in solution, both 
in its free form and in complex with SH2- and 
SH3-binding peptides. First, they used coarse- 
grained (low resolution) molecular dynamics 
simulations to extensively explore and sample 
accessible conformations of the protein in a 
physically meaningful way. Next, they used 
a clustering analysis on the resulting data to 
obtain a set of sub-states for the protein, which 
they used to interpret their experimentally 
obtained SAXS curves. 

A common problem with statistical analyses 
is over-fitting, which occurs when a statistical 
model describes noise, rather than the desired 
underlying relationship. Yang et al. intelligently
avoided over-fitting by invoking only
the minimum number of states that could be 
distinguished from their SAXS data. In addi- 
tion, and equally importantly, the authors used 
a Bayesian statistical analysis of these states to 
accurately determine their fractional popula- 
tions under different experimental conditions 
that change the conformational equilibrium. 
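
The idea of describing a measured SAXS curve as a population-weighted mixture of a few sub-state curves can be illustrated with a toy calculation. The sketch below is not the authors' procedure: it replaces their Bayesian analysis with a plain grid search, uses the Guinier approximation as a stand-in for real scattering profiles, and the radii of gyration, q values and "true" populations are all hypothetical numbers chosen for illustration.

```python
import math


def guinier_curve(rg, q_values):
    """Toy scattering profile: Guinier approximation I(q) = exp(-(q*Rg)^2 / 3)."""
    return [math.exp(-((q * rg) ** 2) / 3.0) for q in q_values]


def mix(curves, weights):
    """Population-weighted average of the per-state curves."""
    n_points = len(curves[0])
    return [sum(w * c[i] for w, c in zip(weights, curves)) for i in range(n_points)]


def fit_populations(curves, observed, step=0.01):
    """Grid search over three non-negative weights that sum to 1,
    keeping the combination with the smallest squared error."""
    n = round(1 / step)
    best_error, best_weights = None, None
    for i in range(n + 1):
        for j in range(n + 1 - i):
            weights = (i * step, j * step, (n - i - j) * step)
            model = mix(curves, weights)
            error = sum((m - o) ** 2 for m, o in zip(model, observed))
            if best_error is None or error < best_error:
                best_error, best_weights = error, weights
    return best_weights


# Hypothetical inputs: q values (1/angstrom) and radii of gyration (angstrom)
# for a compact, a partly open and an open state.
q_values = [0.01 * k for k in range(1, 11)]
state_curves = [guinier_curve(rg, q_values) for rg in (25.0, 30.0, 36.0)]

# Synthetic, noise-free "measurement" built from assumed populations.
true_weights = (0.80, 0.15, 0.05)
observed_curve = mix(state_curves, true_weights)

fitted_weights = fit_populations(state_curves, observed_curve)
```

With noise-free synthetic data the search returns the populations that were used to build the curve. With real, noisy data the number of states that can be distinguished is limited, which is why restricting the model to the minimum number of states matters.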


[Figure 1 panels: a, free protein; b, peptide-bound protein. States shown: assembled, partly disassembled, disassembled.]


Figure 1 | Conformational states of the Hck enzyme in solution. The multidomain enzyme Hck can 
adopt several conformational states in solution, ranging from a compact ‘assembled’ state to partially 
assembled and disassembled states. Different domains are shown in different colours. a, Yang et al.² used
a combination of molecular dynamics simulations with small-angle X-ray scattering (SAXS) data to 
show that, in solution, free molecules of Hck divide into different populations of these states, existing in a 
dynamic equilibrium with each other. The percentages indicate the fraction of the molecular population 
that exists in a particular state. b, The authors also charted major population shifts in response to the 
binding of peptides (not shown) to the SH2 and SH3 domains. The 5% of the population unaccounted for 
in the figure is divided between several other conformational states. (Figure adapted from ref. 2.) 


Yang et al. demonstrated that several assembly
states in equilibrium (not just two) must
be considered to properly understand the 
conformational landscape that is crucial to 
the regulation of Hck (Fig. 1). The authors 
found that the enzyme is predominantly in 
the inactive, assembled conformation (82% 
of enzyme molecules), but is in dynamic equi- 
librium with partially and fully disassembled 
states. The assembled conformation pre- 
dominates even in the absence of a phosphate 
group on the carboxy terminus of the catalytic 
domain. This is notable because phosphory- 
lation of the carboxy terminus was thought 
to anchor Src enzymes in the assembled state, 
with dephosphorylation triggering disassembly 
to the active state. 
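
A population such as the 82% quoted above translates directly into a free-energy difference through the Boltzmann relation. The sketch below is an illustration only: it lumps all non-assembled states into one group and assumes a temperature of 298.15 K, neither of which comes from the study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def free_energy_difference(p_state, p_reference, temperature_k=298.15):
    """Free energy (kcal/mol) of a state relative to a reference state,
    from their equilibrium populations: dG = -RT ln(p_state / p_reference)."""
    return -R_KCAL * temperature_k * math.log(p_state / p_reference)


p_assembled = 0.82                 # fraction quoted in the text
p_not_assembled = 1.0 - p_assembled  # everything else, lumped together (assumption)

delta_g = free_energy_difference(p_not_assembled, p_assembled)
```

Under these assumptions the non-assembled states sit a little under 1 kcal/mol above the assembled one, which is small enough for the populations to shift readily when binding partners are added.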

Yang and colleagues also observed that the 
population equilibrium among the various 
states responds to the presence of signalling 
peptides that, on binding to the SH2 or SH3 
domains, break specific intramolecular inter- 
actions in the enzyme. Taken together, their 
results demonstrate the link between the 
regulation of Hck and the complexity of its conformational-energy
landscape, and exemplify
the inability of single structural images to fully 
describe such an intricate molecular process. 

The development of quantitative approaches 
for characterizing highly fluctuating conforma- 
tional equilibria on the basis of experimental 
data measured in solution is essential if we are 
to develop true statistical mechanical images 
of the potential-energy landscapes intrinsic to 
dynamic biomolecular systems. It is becoming 
clear that structural biology is experiencing a 
paradigm shift, with the realization that excited 
or partially populated states are crucial to bio- 
logical function”, and that the determination 
of single structures from ensemble-averaged 
experimental data can miss vital conforma- 
tional fluctuations or population changes that 
may be essential for biological activity. Ensem- 
ble approaches to the interpretation of SAXS 
and NMR data will inevitably reveal further 
secrets of the role of intrinsic conformational 
dynamics in protein function, as structural 
biology continues its inexorable shift towards 
a richer and more dynamic equilibrium.
A giant surprise


The discovery of an inner giant planet in the unusually massive solar system 
around the star HR 8799 creates an ensemble of planets that is difficult to explain 
with prevailing theories of planet formation. SEE LETTER P.1080 


LAIRD CLOSE 


The solar system around the star HR 8799
should not exist. This system is unlike
any other known: it is a massive system
that has multiple massive planets, with each 
giant planet containing many times the mass of 
all the planets in our Solar System combined. 
However, on page 1080 of this issue, Marois 
and collaborators’ present new 
images of HR 8799 in which yet 
another equally massive planet is 
visible*. 

Previous work’ had imaged 
three planets around HR 8799, 
and now we have the surprise 
discovery of a fourth, HR 8799e, 
an inner, massive planet (about 
10 Jupiter masses) located some 
14.5 astronomical units from the 
star (1 AU is the average distance
from Earth to the Sun). One 
might question the importance of 
the discovery of another extraso- 
lar planet when more than 500 are 
known. But the HR 8799 system 
is the only solar system known to 
have multiple outer planets (the 
other three planets, HR 8799b, 
HR 8799c and HR 8799d, orbit
respectively at approximately 68,
38 and 24 AU from the host star,
and have estimated masses of
about 7, 10 and 10 Jupiter masses).

As HR 8799 is the only known 
example of a wide (greater than 
25 AU) solar system with multiple 
giant planets, astronomers were 
curious to know whether the 
star’s planets could have formed 
by gravitational collapse’ — one 


*This article and the paper under
discussion¹ were published online
on 8 December 2010.


of the most popular theories of outer-planet 
formation. This theory posits that outer giant 
planets form from the fragmentation of the 
disk of gas and dust that develops around stars 
when they are young. In a process rather like 
the way binary stars form, a gravitational instability
in the disk fragments it and quickly (on a
timescale of 10,000 years) leads to the forma- 
tion of gas-giant planets’. But the discovery of 


[Figure 1 labels: cold disk, rapid planet formation by gravitational collapse; warm disk, slow planet formation by core accretion; Jupiter’s orbit around the Sun (for scale).]


Figure 1 | The HR 8799 planetary system. When star HR 8799 formed,
a massive circumstellar disk of gas and dust probably existed from which the
star’s four massive planets formed; the planets’ approximate current orbits are
overlaid and labelled b-e. The outer part of the disk was very cold and rotated
slowly, and so might have collapsed through gravitational instabilities to quickly
form outer planets such as ‘b’. The newly discovered ‘e’ planet¹ is in a very
different zone, where the disk was much warmer and the planet is likely to
have formed in a slow, two-step ‘core-accretion’ process. Neither theory of
planet formation — gravitational collapse or core accretion — can explain the 
whole family of four planets. 
an inner planet such as HR 8799e at 14.5 AU
poses a tricky puzzle. At this distance, the disk 
was neither cold enough nor rotating slowly 
enough to fragment and undergo gravitational 
collapse in situ to form HR 8799e°. 

To explain the formation of this latest planet, 
Marois et al.¹ appeal to the dominant theory
of giant-planet formation: a slower process
than gravitational collapse (about 3.5 million
years at a distance of 10 AU) in which solid
dust grains conglomerate into solid cores of 
tens of Earth masses and then gravitationally 
accrete disk gas to grow to Jupiter masses. 
Such a ‘core-accretion’ process itself is only 
marginally fast enough at 14.5 AU to build up
HR 8799e’s roughly 10 Jupiter masses before
the disk gas accretes onto the star in less than 
10 Myr. This formation timescale problem’* 
becomes even more vexing if one consid- 
ers that, at about 2.6 times the distance 
HR 8799e is from the host 
star, HR 8799c would require 
about 20 times longer (more 
than about 200 Myr) to grow 
to the same mass at 38 AU,
long after the disk has lost all 
its gas. What’s more, at 68 AU, 
HR 8799b’s formation is truly 
problematic, requiring an even 
longer timescale (many times the 
age of the star) to have formed 
in situ by core accretion. Hence, 
neither of the two favoured the- 
ories of giant-planet formation 
can explain how all the plan- 
ets around HR 8799 formed: 
HR 8799e is too close to have 
formed by gravitational collapse, 
and HR 8799c and HR 8799b are 
too far out to have formed by core 
accretion (Fig. 1). 
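
The numbers quoted above imply a steep dependence of the core-accretion timescale on orbital distance. As a back-of-envelope sketch, assuming a simple power law (timescale proportional to distance raised to some exponent, which is an inference from the quoted figures rather than a statement from the paper) and a baseline of roughly 10 Myr at 14.5 AU, the implied exponent and the extrapolation to 68 AU can be worked out as follows:

```python
import math

# Orbital distances quoted in the text (AU).
a_e, a_c, a_b = 14.5, 38.0, 68.0

# "About 20 times longer" at "about 2.6 times the distance".
distance_ratio = a_c / a_e
time_ratio = 20.0

# Assumed power law: timescale ~ distance ** exponent.
exponent = math.log(time_ratio) / math.log(distance_ratio)

# Extrapolation to HR 8799b under the same assumed power law,
# with an assumed baseline of 10 Myr at 14.5 AU.
baseline_myr = 10.0
timescale_b_myr = baseline_myr * (a_b / a_e) ** exponent
```

The implied exponent is close to 3, and the extrapolated timescale for HR 8799b comes out at roughly a billion years, which illustrates why in situ core accretion at 68 AU is described as truly problematic.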

Perhaps all of these massive 
planets formed at much larger distances
(at least 50 AU)
by the gravitational collapse of an
unusually massive disk and then 
migrated quickly inwards to 
their current positions, some- 
how sweeping into a dynami- 
cally stable set of 1:2:4 orbital 
resonances’ (where, for every 
one orbit of planet c, there are 
two of d and four of e). This does
not really help the situation, 
however, because it is unlikely
that such a massive planet as HR 8799e could 
have migrated from about 50 to 14.5 AU by
means of tidal torques from the residual gas 
that had not been used to build up the plan- 
ets. The converse theory, by which the planets 
all form through core accretion within about 
10 AU and then slowly move outwards by scattering
lesser objects (planetesimals) inwards, is
also problematic because there is probably too 
limited a reservoir of planetesimals to move a 
7-Jupiter-mass object such as HR 8799b outwards
some 58 AU. So, despite having a clear
view of the system — thanks to the power of 
adaptive-optics systems and large ground- 
based telescopes — we cannot currently explain 
how all four planets formed in a coherent, 
coeval fashion. 
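
The 1:2:4 resonance mentioned above can be checked roughly against the quoted orbital distances using Kepler's third law, under which the orbital period scales as distance to the power 3/2 for bodies orbiting the same star. The sketch below treats the quoted separations as semi-major axes of circular orbits, which is an assumption made here for illustration.

```python
def period_ratio(a_outer_au, a_inner_au):
    """Ratio of orbital periods from Kepler's third law (P ~ a ** 1.5).
    The stellar mass cancels because both planets orbit the same star."""
    return (a_outer_au / a_inner_au) ** 1.5


# Separations quoted in the text, treated as semi-major axes (assumption).
a_e, a_d, a_c = 14.5, 24.0, 38.0

ratio_d_to_e = period_ratio(a_d, a_e)  # orbits of e per orbit of d
ratio_c_to_d = period_ratio(a_c, a_d)  # orbits of d per orbit of c
ratio_c_to_e = period_ratio(a_c, a_e)  # orbits of e per orbit of c
```

The three ratios come out near 2, 2 and 4, consistent with the approximate 1:2:4 pattern described for planets c, d and e.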

A key strength of direct imaging is that pho- 
tons can be collected from these self-luminous 
young planets as they contract, allowing the 
planetary spectra to be observed (to calculate 
temperatures and luminosities). The observed 
brightness of HR 8799b in direct images is 
much lower than would be expected from 
its observed temperature, given that evolu- 
tionary models indicate that HR 8799b must 
have a radius larger than that of Jupiter’**. 
This ‘under-luminosity’ problem is typical of 
around half of the extrasolar planets imaged 
to date. One possible explanation is that dusty, 
thick, planetary-scale high-latitude cloud 
‘bands’ absorb/scatter light when viewing a 
young planet over its pole. For example, the 
‘under-luminous’ planets in the HR 8799 sys- 
tem are probably being viewed close to ‘pole- 
on’, perhaps leading to less light emitted in 
the direction of Earth. By contrast, ‘edge-on’
giant planets, such as β Pictoris b, look
brighter because light streams freely from the 
brighter equatorial regions between the dark 
cloud bands. Clearly, further theoretical (and 
direct imaging) work will be needed to identify 
the ultimate cause of this under-luminosity 
problem. 

The future holds much promise for more 
surprises in the field of direct imaging of extra- 
solar planets. However, it seems unlikely that 
any other massive outer planets will be found 
around HR 8799”. There is always a chance, 
though, that low-mass terrestrial planets 
lie within the star’s 10-AU-radius ‘asteroid’
belt. The next chapter in this story will soon 
be written by even more powerful ground- 
based, adaptive-optics imagers*” and, let us 
hope, by more powerful pathfinding, space- 
based planet- and disk-imaging telescopes”. 
These pathfinders should eventually lead to 
a terrestrial-planet-finding telescope even 
capable of taking spectra of Earth-like plan- 
ets. Such an achievement could address one 
of the most pivotal questions in science: how 
common are truly Earth-like planets and life in 
our Universe?
A gene for impulsivity


Impulsivity has been linked to various psychiatric disorders and forms of violent 
behaviour. A gene mutated in a population of violent Finnish criminal offenders 
provides clues to the neural basis of this trait. SEE ARTICLE P.1061 


JOHN R. KELSOE 


An old adage admonishes us to look
before we leap. This bit of common
sense reflects a crucial and complex
brain function that regulates behaviour. To act 
without thinking — impulsivity — is to risk 
leaping off a cliff. But to excessively delay an 
action may lead to inaction or missed oppor- 
tunities. Now, using a powerful genomics 
approach, Bevilacqua and colleagues’ dissect 
elements of the neurotransmitter system in the 
brain that mediates impulsivity and show that 
the serotonin 2B receptor (HTR2B) has a role 
in severe impulsivity, at least in one human 
population (page 1061 of this issue). 

Impulsivity is generally thought to be a fail- 
ure of inhibitory function in the brain’. Clearly, 
fine-tuning of such inhibition is important 
for an organism to adapt to its environment. 
Impulsivity has been linked to a variety of 
behavioural and psychiatric syndromes, 
including attention deficit hyperactivity dis- 
order, mania, drug addiction and borderline 
personality disorder (BPD)*". It has also been 
associated with violent behaviour, as seen in 
antisocial personality disorder (ASPD) and 
intermittent explosive disorder (IED), and 
with suicide. Several neurotransmitters — 
serotonin and dopamine in particular — have 
been implicated in mediating impulsivity, but 
unravelling the underlying mechanisms has 
proved challenging. 

In their search for genes predisposing to 
impulsivity, Bevilacqua et al.’ used the well- 
studied ‘founder’ population of Finland. 
Because of the country’s relative isolation, 
the current Finnish population is believed to 
be largely derived from two waves of immi- 
gration 4,000 and 2,000 years ago’. It has 
therefore been argued that, compared with 
other, more-outbred populations, there may 
be fewer mutations for genetic traits in this
population, and studies of various genetic 
disorders support this assumption. To fur- 
ther enhance their odds of success, Bevi- 
lacqua and co-workers focused on Finnish 
subjects with the most extreme manifestation 
of impulsivity — violent offenders whom the 
authors evaluated in a forensic psychiatric 
unit and who had strong lifetime histories of 
aggressive acts. 

Specifically, Bevilacqua et al. examined 
96 individuals for 14 candidate genes, using 
next-generation sequencing technology to 
identify possible disorder-causing mutations. 
They focused on the protein-coding regions 
(exons) of the genes as the regions in which 
mutations would most probably affect gene 
function. They found a variation at a single 
nucleotide base — dubbed HTR2B Q20* — in 
the HTR2B gene, which results in an errone- 
ous stop codon, a signal that ends further pro- 
tein extension. The researchers show that this 
mutation triggers a process called nonsense- 
mediated RNA decay, such that no HTR2B- 
receptor protein is expressed. 
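
The notation Q20* means that the codon for glutamine (Q) at position 20 has been replaced by a stop codon (*). A short enumeration over the standard genetic code shows which single-base substitutions can do this. The text does not say which glutamine codon HTR2B carries at this position, so the sketch below considers both, on the DNA coding strand.

```python
BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code, DNA coding strand
GLUTAMINE_CODONS = ("CAA", "CAG")     # the two codons for glutamine (Q)


def single_base_changes_to_stop(codon):
    """List every single-nucleotide substitution that turns `codon` into a
    stop codon, as (position, original base, new base, mutant codon)."""
    hits = []
    for index, original in enumerate(codon):
        for base in BASES:
            if base == original:
                continue
            mutant = codon[:index] + base + codon[index + 1:]
            if mutant in STOP_CODONS:
                hits.append((index + 1, original, base, mutant))
    return hits


nonsense_routes = {
    codon: single_base_changes_to_stop(codon) for codon in GLUTAMINE_CODONS
}
```

For either glutamine codon the only route to a stop codon by a single substitution is a C-to-T change at the first position, giving TAA or TAG.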

The HTR2B Q20* mutation was present in 
the violent offenders at three times the rate of 
that in matched controls (psychiatrically nor- 
mal Finnish individuals). It was also inherited, 
along with psychiatric illnesses such as ASPD, 
IED and BPD, by members of their families. The 
17 violent offenders who carried HTR2B Q20* 
had committed an average of five violent
crimes, 94% of which were committed under 
the influence of alcohol. These crimes were 
largely aggressive reactions to minor events 
that lacked premeditation or financial gain 
as a goal. 

The authors’ results are consistent with 
many animal and human studies that impli- 
cate serotonin in impulsive behaviour®. Previ- 
ous studies have in general supported the idea 
that low serotonin levels are associated with 
impulsive action. For instance, activation 
Figure 1 | HTR2B and the regulation of impulsivity. Bevilacqua et al.¹ find that, in a Finnish
subpopulation, a mutation in the serotonin receptor HTR2B is linked to severe impulsivity. In the 
nucleus accumbens region (green) of the brain, projections of neurons that secrete serotonin (red) 
interact with those that secrete dopamine (blue). This region has been repeatedly shown to play a 
crucial part in choice and impulsivity. Mutations in HTR2B, which modulates the release of dopamine 
and serotonin in the nucleus accumbens, may reduce the release of these neurotransmitters, leading
to increased impulsive behaviour.


of the 5-HT1A receptor, which may inhibit
serotonin release, has been linked to impulsiv- 
ity in animal models*. Moreover, the levels of 
5-hydroxyindoleacetic acid — a metabolite of 
serotonin — are reduced in the cerebrospinal 
fluid of people who are suicidal’. Further- 
more, individuals whose serotonin levels have 
been experimentally lowered by diet are more 
likely to make impulsive choices¹⁰. Nonetheless,
the role of serotonin is probably complex,
not least because the serotonin system
includes 14 different receptors with sometimes 
opposing actions. 

The HTR2B receptor received little attention 
in earlier studies of impulsivity. So, to support 
their human data, Bevilacqua et al.¹ examined
mice that lack the Htr2b gene. They observed 
increased impulsive behaviour in these animals 
according to several measures. How exactly 
HTR2B deficiency leads to this effect remains 
unclear, although the authors find that both 
male mice lacking Htr2b and men carrying the 
HTR2B Q20* mutation have elevated levels of 
the hormone testosterone. 

Previous work’ suggests that HTR2B may 
function by modulating both serotonin and 
dopamine in the nucleus accumbens — a brain 
region involved in impulsive behaviour (Fig. 1). 
For instance, the ‘club drug’ ecstasy has been 
shown¹¹ to stimulate the release of both serotonin
and dopamine in the nucleus accumbens
by directly activating HTR2B. It could there- 
fore be that depletion of the HTR2B receptor 
results in increased impulsive behaviour by 
reducing the release of both serotonin and 
dopamine in the nucleus accumbens. However, 
much more work is required to elucidate how 
HTR2B regulates impulsive behaviour through
its modulation of the interaction between 
pathways involving serotonin and dopamine. 

Bevilacqua and colleagues’ observation¹ that
the HTR2B Q20* mutation is unique to Finns 
serves as yet another reminder of the high level 
of heterogeneity likely to be seen in complex
genetic traits and the importance of population 
history. But although this specific mutation is 
absent in non-Finnish populations, different 
mutations in the HTR2B gene might operate 
in other populations. 

Bevilacqua and colleagues’ paper also illus- 
trates the power of exon-based sequencing in 
founder populations, and suggests that exonic 
mutations of strong functional effect do play a
part in complex behavioural traits.


John R. Kelsoe is in the Department of Psychiatry, University of California,
San Diego, and the VA San Diego Healthcare System, La Jolla, California 92014, USA.
e-mail: jkelsoe@ucsd.edu


1. Bevilacqua, L. et al. Nature 468, 1061-1066 (2010).
2. Cardinal, R. N. Neural Netw. 19, 1277-1301 (2006).
3. Moeller, F. G., Barratt, E. S., Dougherty, D. M., Schmitz, J. M. & Swann, A. C. Am. J. Psychiatry 158, 1783-1793 (2001).
4. Swann, A. C., Lijffijt, M., Lane, S. D., Steinberg, J. L. & Moeller, F. G. Bipolar Disord. 11, 280-288 (2009).
5. Peltonen, L., Jalanko, A. & Varilo, T. Hum. Mol. Genet. 8, 1913-1923 (1999).
6. Robbins, T. W. Psychopharmacology 163, 362-380 (2002).
7. Pattij, T. & Vanderschuren, L. J. M. J. Trends Pharmacol. Sci. 29, 192-199 (2008).
8. Winstanley, C. A., Theobald, D. E., Dalley, J. W. & Robbins, T. W. Neuropsychopharmacology 30, 669-682 (2005).
9. Träskman, L., Åsberg, M., Bertilsson, L. & Sjöstrand, L. Arch. Gen. Psychiatry 38, 631-636 (1981).
10. Rogers, R. D. et al. Psychopharmacology 146, 482-491 (1999).
11. Doly, S. et al. J. Neurosci. 28, 2933-2940 (2008).



Reader’s block 


Protein factors can regulate gene expression by binding to specifically modified 
DNA-associated proteins. Small molecules that selectively interfere with such 
interaction may be of therapeutic value. SEE ARTICLE P.1067 & LETTER P.1119


SEAN D. TAVERNA & PHILIP A. COLE 


Protein factors are crucial for controlling
gene expression. One group of such
factors affects gene activity by ‘reading’
epigenetic marks — reversible modifications 
such as the addition of phosphate, acetyl or 
methyl groups — on proteins after their trans- 
lation from RNA. The factors’ target proteins 
are histones, which associate with DNA to form 
chromatin. The reading ability of these protein 
factors is a result of specific, well-folded sub- 
domains, sometimes called readers, which can 
distinguish between the post-translationally 
modified state of their binding partner and its 
unmodified state. Two papers’” in this issue 
describe highly potent and selective inhibi- 
tor molecules that compete with acetylated 
histones for binding to a set of such readers. 
These data have therapeutic implications.

Post-translational modifications can 
often influence transient protein-protein 
interactions by creating or disrupting bind- 
ing surfaces on the molecules. Among such 
modifications, acetylation on lysine amino- 
acid residues has been centre stage: originally 
discovered’ more than 40 years ago as a regula- 
tor of chromatin structure, this modification 
has now been detected in thousands of other 
proteins’. Acetyl-lysine modifications facili- 
tate the interaction of the protein with proteins 
that contain bromodomains — evolutionarily 
conserved subdomains that can specifically 
bind, or read, the acetylated form of the lysine 
during regulatory processes’. Such interactions 
are thought to regulate transcription and to 
be involved in various diseases, including 
cancer. 


A range of acetyltransferase enzymes 
(writers) add acetyl groups to lysine residues, 
and two families of deacetylase enzymes 
(erasers) remove these groups®. Two recently 
approved anticancer drugs’ — SAHA and dep- 
sipeptide — work by blocking deacetylases, and 
have galvanized the pharmaceutical industry's 
interest in targeting chromatin modifications. 
In fact, several start-up biotech companies 
have attempted to target erasers and writers of 
lysine acetylation. In general, however, even 
highly specific inhibitors of acetyltransferases 
and deacetylases that mediate post-transla- 
tional modification can have undesired side 
effects, because blocking these enzymes can 
affect many different protein substrates and 
biochemical pathways. 

As for targeting protein-protein inter- 
actions, with several notable exceptions 
the use of small-molecule drugs has been 
considered extremely difficult® because their 
binding regions frequently consist of wide, 
shallow surfaces. For example, despite decades 
of work, pharmacologically practical com- 
pounds that disrupt the binding of phosphor- 
ylated proteins to their SH2-domain-containing 
protein partners have remained elusive. There 
have also been a couple of attempts to use small 
molecules to inhibit the interactions between 
proteins containing bromodomains and those 
carrying acetyl-lysines, but focus on this line of 
research has generally been limited’. 

Using very different approaches, Filippa- 
kopoulos et al. (page 1067) and Nicodeme 
et al. (page 1119) now converge on a closely 
related set of chemical scaffolds (the 
triazole-diazepine-fused ring compounds JQ1 and 
I-BET) that inhibit the acetyl-lysine-reading 
ability of a specific class of bromodomain. 
Both sets of compounds bind tightly to bromo- 
domains in proteins of the BET family by 
exploiting the unusual pockets characteristic 
of this protein family (Fig. 1). 

The bromodomains of BET proteins show 
a strong preference for housing doubly modi- 
fied acetyl-lysine histone tails in their wide 
and highly structured hydrophobic pockets”. 
Because of their shape and electrical prop- 
erties, these pockets are also well suited for 
binding small molecules. Indeed, the present 
papers’ structural data’” confirm that JQ1 and 
I-BET fit snugly into the acetyl-lysine pockets 
in a stereospecific fashion. Thermodynamic 
measurements further establish that both of 
the bromodomain-inhibitor interactions are of 
high affinity (with dissociation constants below 
100 nM) and, compared with their interaction 
with other non-BET types of bromodomain, 
show great selectivity (at least 100-fold). 

The two teams also pursue distinct biomedical 
applications for JQ1 and I-BET. 
Filippakopoulos et al.1 examine whether JQ1 can 
antagonize the growth of a rare but aggressive 
form of cancer called midline carcinoma. This 
cancer is defined by a gene fusion that results in 
the unnatural linkage of BRD4, a BET protein 
containing two bromodomains, with another 
protein called NUT. The BRD4-NUT fusion 
protein mediates increased acetylation of certain 
chromatin domains that are normally 
transcriptionally inactive11, and so it was predicted that 
inhibitors of the BRD4 bromodomains would 
shut down tumour growth mediated by this 
mechanism. Filippakopoulos and colleagues 
confirm this hypothesis, showing that JQ1 could 
blunt the growth of midline carcinoma cells in 
culture, as well as in mice into which the tumour 
cells were introduced. 

Figure 1 | Targeting the interaction between bromodomains and acetyl-lysine moieties. 
a, Filippakopoulos et al.1 show that JQ1, a small-molecule competitive inhibitor that blocks the 
interaction of bromodomains of BET proteins with acetylated lysines (Ac), can inhibit the proliferation 
of tumour cells expressing the BRD4-NUT oncoprotein. b, Nicodeme et al.2 show that pretreatment of 
cells with another small-molecule competitive inhibitor, I-BET, which interferes with the interaction 
between the bromodomain of the BET protein BRD4 and Ac, can mute the transcription of genes that are 
induced during inflammatory responses. 

Nicodeme et al.2 investigate whether I-BET 
modulates genes mediating immunological 
and inflammatory responses. They find that it 
inhibits the expression of a subset of genes nor- 
mally induced in response to toxic injury, with 
histone acetylation being reduced in the chro- 
matin regions around these genes. In a practi- 
cal application of these findings, the authors 
demonstrate that treating mice with I-BET 
protects against the excessive inflammatory 
response to septic shock. Such results point 
to the clinical potential of BET-bromodomain 
inhibitors in immuno-modulation therapies. 

The two papers1,2 provide credibility for 
the idea of extending the pharmacology of 
targeting chromatin modifications beyond 
enzymatic activities and into the challenging 
arena of protein-protein interactions. One 
appeal of this strategy is that it avoids the 
promiscuity of enzyme inhibitors. 

The studies further raise the prospect 
of identifying inhibitors of other readers, 
such as those that bind proteins containing 
methyl-lysine modifications. Nonetheless, 
antagonizing a reader of post-translational 
modifications might also prompt unknown 
and unwanted alterations in biological 
pathways — a complication that necessitates 
extensive follow-up studies before such agents 
can move into the clinic. Moreover, the possible 
uniqueness of BET-bromodomain structures 
makes it difficult to predict whether small- 
molecule inhibitors would be similarly effective 
in antagonizing other bromodomain forms and 
reader modules. Nevertheless, the new tools 
described by these studies will undoubtedly 
prove attractive to biologists interested in the 
dynamics of chromatin and gene expression 
in physiology and disease. 
Genetic history of an archaic hominin 
group from Denisova Cave in Siberia 
Using DNA extracted from a finger bone found in Denisova Cave in southern Siberia, we have sequenced the genome of an 
archaic hominin to about 1.9-fold coverage. This individual is from a group that shares a common origin with 
Neanderthals. This population was not involved in the putative gene flow from Neanderthals into Eurasians; however, 
the data suggest that it contributed 4-6% of its genetic material to the genomes of present-day Melanesians. We designate 
this hominin population ‘Denisovans’ and suggest that it may have been widespread in Asia during the Late Pleistocene 
epoch. A tooth found in Denisova Cave carries a mitochondrial genome highly similar to that of the finger bone. This tooth 
shares no derived morphological features with Neanderthals or modern humans, further indicating that Denisovans 
have an evolutionary history distinct from Neanderthals and modern humans. 


Less than 200,000 years ago, anatomically modern humans (that is, 
humans with skeletons similar to those of present-day humans) 
appeared in Africa. At that time, as well as later when modern humans 
appeared in Eurasia, other ‘archaic’ hominins were already present in 
Eurasia. In Europe and western Asia, hominins defined as Neanderthals 
on the basis of their skeletal morphology lived from at least 230,000 
years ago before disappearing from the fossil record about 30,000 years 
ago’. In eastern Asia, no consensus exists about which groups were 
present. For example, in China, some have emphasized morphological 
affinities between Neanderthals and the specimen of Maba’, or between 
Homo heidelbergensis and the Dali skull’. However, others classify these 
specimens as “early Homo sapiens”. In addition, until at least 17,000 
years ago, Homo floresiensis, a short-statured hominin that seems to 
represent an early divergence from the lineage leading to present-day 
humans*”, was present on the island of Flores in Indonesia and possibly 
elsewhere. 

DNA sequences retrieved from hominin remains offer an approach 
complementary to morphology for understanding hominin relation- 
ships. For Neanderthals, the nuclear genome was recently determined 
to about 1.3-fold coverage*. This revealed that Neanderthal DNA 
sequences and those of present-day humans share common ancestors 
on average about 800,000 years ago and that the population split of 
Neanderthal and modern human ancestors occurred 270,000- 
440,000 years ago. It also showed that Neanderthals shared more 
genetic variants with present-day humans in Eurasia than with pre- 
sent-day humans in sub-Saharan Africa, indicating that gene flow 
from Neanderthals into the ancestors of non-Africans occurred to 
an extent that 1-4% of the genomes of people outside Africa are 
derived from Neanderthals*. In addition, ten partial and six complete 


mitochondrial (mt)DNA sequences have been determined from 
Neanderthals’"’’. This has shown that all Neanderthals studied so 
far share a common mtDNA ancestor on the order of 100,000 years 
ago’’, and in turn, share a common ancestor with the mtDNAs of 
present-day humans about 500,000 years ago'”"*” (as expected, this is 
older than the Neanderthal-modern human population split time of 
270,000-440,000 years ago estimated from the nuclear genome’). One 
of these mtDNA sequences has also shown that hominins carrying 
mtDNAs typical of Neanderthals were present as far east as the Altai 
Mountains in southern Siberia’’. 

In 2008, the distal manual phalanx of a juvenile hominin was exca- 
vated at Denisova Cave. This site is located in the Altai Mountains in 
southern Siberia, and is a reference site for the Middle to Upper 
Palaeolithic of the region where systematic excavations over the past 
25 years have uncovered cultural layers indicating that human occu- 
pation at the site started up to 280,000 years ago*’. The phalanx was 
found in layer 11, which has been dated to 50,000 to 30,000 years ago. 
This layer contains microblades and body ornaments of polished 
stone typical of the ‘Upper Palaeolithic industry’ generally thought 
to be associated with modern humans, but also stone tools that are 
more characteristic of the earlier Middle Palaeolithic, such as side- 
scrapers and Levallois blanks**”*. 

Recently, we used a DNA capture approach” in combination with 
high-throughput sequencing to determine a complete mtDNA genome 
from the Denisova phalanx. Surprisingly, this mtDNA diverged from 
the common lineage leading to modern human and Neanderthal 
mtDNAs about one million years ago”, that is, about twice as far back 
in time as the divergence between Neanderthal and modern human 
mtDNAs. However, mtDNA is maternally inherited as a single unit 
without recombination, and therefore is subject to chance events such 
as genetic drift, as well as gene flow and positive selection. In contrast, 
the nuclear genome comprises tens of thousands of unlinked, mostly 
neutrally evolving loci. This allows for analyses of genetic relationships 
that are robust to the stochasticity of genetic drift, and are much less 
affected by positive selection. To clarify the relationship of the Denisova 
individual to other hominin groups, we have therefore sequenced the 
Denisova nuclear genome and analysed its genomic relationships to 
Neanderthals and present-day humans. We have also attempted to 
clarify the chronology of hominin occupation of the cave and have 
identified a tooth from this group of hominins among material exca- 
vated in Denisova Cave. 


DNA sequence determination 


The entire internal portion of the phalanx sample was used for DNA 
extraction in our clean-room facility, where procedures to minimize 
contamination from present-day human DNA are rigorously imple- 
mented***> (Supplementary Information section 1). The DNA was 
treated with two enzymes: uracil-DNA-glycosylase, which removes 
uracil residues from DNA to leave abasic sites”°, and endonuclease 
VIII, which cuts DNA at the 5’ and 3’ sides of abasic sites. Subsequent 
incubation with T4 polynucleotide kinase and T4 DNA polymerase 
was used to generate 5’-phosphorylated blunt ends that are amenable 
to adaptor ligation. Because the great majority of uracil residues occur 
close to the ends of ancient DNA molecules, this procedure leads to 
only a moderate reduction in average length of the molecules in the 
library, but a several-fold reduction in uracil-derived nucleotide 
misincorporation”’. 

Two independent sequencing libraries (SL3003 and SL3004) were 
created from the DNA, using a modified Illumina protocol”* where a 
polymerase chain reaction (PCR) is used to add a 7-nucleotide index 
(in this case 5’-GTCGACT-3’) to the library molecules. This index 
ensures that the libraries are not contaminated by other sequencing 
libraries when they are taken out of the clean room to be sequenced”. 
The libraries were sequenced on the Illumina Genome Analyser IIx 
platform for 101 cycles from each end of the molecules and an addi- 
tional 7 cycles for determination of the index until almost every 
unique sequence in the libraries had been seen multiple times, that 
is, almost every clone present in the libraries has been sequenced 
(Supplementary Information section 1). Bases were called using the 
machine-learning algorithm Ibis” and an overlap of at least 11 bases 
was required for paired-end reads to be fused to full-molecule-size 
DNA sequences that were further analysed. This results in a greatly 
reduced error rate”’, although it removes the few molecules that are 
above 191 nucleotides in length from analysis (~0.1% in SL3003 and 
~0.2% in SL3004). Sequences were mapped using the program 
BWA” to the human (hg18/NCBI 36) and the chimpanzee 
(panTro2/CGSC 2.1) genomes as well as to the inferred ancestral 
genome of these species (from the six-way Enredo-Pecan-Ortheus 
alignment)**. PCR duplicates were identified and used to further 
increase sequence accuracy by calling consensus sequences. 
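
The 191-nucleotide limit follows from the read length and the overlap requirement: two 101-base reads that must share at least 11 bases can span at most 2 x 101 - 11 = 191 nucleotides. A minimal Python sketch of this arithmetic and of overlap-based merging is given below. It assumes exact matching within the overlap and ignores adaptor trimming and base qualities, so it illustrates the principle only and is not the pipeline used here; all function names are hypothetical.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def max_mergeable_length(read_len: int = 101, min_overlap: int = 11) -> int:
    """Longest molecule whose two reads still overlap by min_overlap bases."""
    return 2 * read_len - min_overlap


def merge_pair(fwd: str, rev: str, min_overlap: int = 11):
    """Fuse a read pair into one sequence, or return None if the
    overlap is shorter than min_overlap (exact matching only)."""
    rev_rc = reverse_complement(rev)
    # Try the longest candidate overlap first.
    for overlap in range(min(len(fwd), len(rev_rc)), min_overlap - 1, -1):
        if fwd[-overlap:] == rev_rc[:overlap]:
            return fwd + rev_rc[overlap:]
    return None


def simulate_pair(molecule: str, read_len: int = 101):
    """Hypothetical helper: error-free reads from both ends of a molecule."""
    return molecule[:read_len], reverse_complement(molecule)[:read_len]


def random_molecule(length: int, seed: int = 0) -> str:
    """Hypothetical helper: a reproducible random DNA string for testing."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))
```

With these definitions, a simulated 191-nucleotide molecule is recovered in full, whereas a 192-nucleotide molecule overlaps by only 10 bases and is rejected.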

A total of 82,227,320 sequences mapped uniquely (mapping 
quality ≥30) to the human genome, yielding about 5.2 gigabases of 
DNA sequences (1.9-fold genomic coverage), and 72,304,848 sequences 
mapped uniquely to the chimpanzee genome. When the substitutions 
inferred to have occurred on the Denisova and the present-day human 
lineages were compared, the relative numbers of different classes of 
nucleotide substitutions are remarkably similar, and the excess number 
of candidate substitutions on the Denisova lineage relative to the 
present-day human lineage is only 1.7-fold (Supplementary Fig. 2.2 
and Supplementary Table 2.4). This reflects an improvement in error 
rate over the Neanderthal genome by over an order of magnitude8 and is 
mainly due to the enzymatic removal of uracil residues from the 
Denisova DNA”. We estimate that most errors in the Denisova DNA 
sequences are due to low genomic coverage rather than to any features 
typical of ancient DNA. 
Human DNA contamination estimates 


Although rigorous measures to prevent contamination of the experi- 
ments by DNA from present-day humans were implemented at all 
laboratory steps, it is impossible to completely prevent contamination 
because bone samples as well as reagents may be contaminated before 
they enter the clean-room facility. To estimate the levels of contami- 
nation in the sequences produced we used three approaches (Sup- 
plementary Information section 3). 

First, we estimated the level of mtDNA contamination using 276 
sequence positions where the Denisova mtDNA differs from >99% of 
present-day human mtDNAs. For library SL3003, we observed 7,433 
unique sequences that covered such positions and 7,421 were of the 
Denisova type. For library SL3004 the corresponding numbers were 
5,042 and 5,036, indicating that the mtDNA contamination in the 
libraries is on the order of 0.2% (95% confidence interval (CI): 0.1- 
0.3%) and 0.1% (CI: 0.1-0.3%), respectively. 
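
The point estimates follow directly from these counts: 12 of 7,433 sequences (about 0.16%) in SL3003 and 6 of 5,042 (about 0.12%) in SL3004 are not of the Denisova type. A minimal Python sketch of the calculation follows. The Wilson score interval used here is an assumption chosen for illustration; the method behind the confidence intervals quoted above is not stated in this section, so the bounds need not agree exactly. Function names are hypothetical.

```python
import math


def wilson_interval(k: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def contamination_rate(total: int, endogenous: int):
    """Fraction of sequences at diagnostic positions that are not endogenous,
    together with an interval for that fraction."""
    contaminant = total - endogenous
    return contaminant / total, wilson_interval(contaminant, total)


# Counts reported above for the two libraries.
rate_3003, ci_3003 = contamination_rate(7_433, 7_421)
rate_3004, ci_3004 = contamination_rate(5_042, 5_036)
```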

Second, we identified sequences that are unique to the Y chro- 
mosome’. If the individual from whom the phalanx derives is female, 
the number of such sequences represents the extent of male DNA 
contamination. We found zero and three such Y chromosomal 
sequences in the two libraries, respectively, whereas 1,449 and 696 are 
expected if the individual is male. Thus, the bone derives from a female 
and male DNA contamination in the two libraries is on the order of 
0.00% (CI: 0.00-0.25%) and 0.43% (CI: 0.09-1.26%), respectively. 
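
The male-contamination figures can be checked in the same way: the point estimate is the observed number of Y-chromosomal sequences divided by the number expected for a male. For a library with zero observations, the upper limit of an exact binomial interval has the closed form 1 - (alpha/2)^(1/n), which for n = 1,449 gives about 0.25%, in line with the upper bound quoted above. The Python sketch below assumes that the expected count can be treated as the number of binomial trials, which is a simplification for illustration; the function names are hypothetical.

```python
def contamination_point_estimate(observed: int, expected_if_male: int) -> float:
    """Observed Y-chromosomal sequences as a fraction of those expected
    if the individual were male."""
    return observed / expected_if_male


def zero_count_upper_bound(n_trials: int, alpha: float = 0.05) -> float:
    """Upper limit of the exact two-sided (1 - alpha) binomial interval
    when zero events are observed in n_trials."""
    return 1.0 - (alpha / 2.0) ** (1.0 / n_trials)


# Counts reported above: 0 of 1,449 expected and 3 of 696 expected.
upper_first_library = zero_count_upper_bound(1_449)
estimate_second_library = contamination_point_estimate(3, 696)
```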

Third, to estimate the extent of nuclear DNA contamination we 
used one library to identify positions where the Denisova individual 
carries an ancestral, that is, chimpanzee-like, sequence variant that 
among present-day humans is derived and not known to vary. We 
then examined sequences that map at these positions in the other 
library and determined if they carry the ancestral sequence or the 
derived sequence. Observation of a derived sequence in the second 
library could be due to one of three possibilities: that the DNA frag- 
ment in question comes from present-day human contamination; 
that the Denisova individual is heterozygous at the position in ques- 
tion; or that there has been a sequencing error. We implemented a 
maximum likelihood method that uses the number of independent 
observations of ancestral and derived states across positions to co- 
estimate contamination along with heterozygosity and sequencing 
error as nuisance parameters (Supplementary Information section 3). 
From this analysis, both libraries are inferred to have contamination 
rates of less than 1%. 


Ancestral features and duplications 


The Denisova draft genome sequence allows features that are ancestral 
in the Denisova genome and derived in present-day humans to be 
identified. We previously described a set of 10.5 million single nucleo- 
tide differences and about half a million insertion/deletions (indels) 
inferred to be due to changes that occurred on the human lineage since 
the split from the common ancestor with the chimpanzee’. Of these, 
4,267,431 (40.5%) single nucleotide differences and 105,372 (22.0%) 
indels are covered by the Denisova sequences. We identified 129 
inferred amino acid substitutions and 14 indels in the coding sequences 
of genes where the Denisova individual carries the ancestral alleles at 
positions where present-day humans carry derived alleles and are not 
known to vary (Supplementary Information section 4). We also iden- 
tified 90 such sites in 5’ untranslated regions (UTRs), 392 in 3’ UTRs, 
two in microRNA genes and 104 in human accelerated regions. When 
we compared the Denisova and Neanderthal genomes we found that 
they carry the same assigned state at single nucleotide differences in 
87.9% of the ancestral positions and 97.7% of the derived positions. 
The results for indels are similar: 87.6% for ancestral states and 98.6% 
for the derived states (Supplementary Table 4.3). 

We analysed the segmental duplication content of the Denisova 
genome by detecting regions with an excess read depth (Supplemen- 
tary Information section 5). In a three-way comparison of Denisova, 
Neanderthal and present-day human genomes, we found an excess of 
private Denisova duplications (2.27 megabases (Mb)) compared with 
duplications that were private in Neanderthals (0.60 Mb) or present- 
day humans (1.32 Mb). These regions were identified based on sig- 
natures of both excess read depth and increased sequence divergence, 
making them unlikely to be artefacts. We also identified two regions 
where the duplication architecture of Denisova is more similar to that 
of chimpanzee than to that of either Neanderthals or present-day 
humans, including two chromosomal regions associated with neuro- 
logical disease in humans: spinal muscular atrophy on 5q13 (includ- 
ing SMN2, one of the most recent gene duplications in the human 
lineage) and neuropsychiatric disease on 16p12.1. 


Relationship to Neanderthals and modern humans 


A fundamental question is whether the Denisova individual is an out- 
group to Neanderthals and modern humans, as the mtDNA suggests”, 
whether it is a sister group to Neanderthals or to modern humans, or 
whether it falls within the range of variation of either of these two 
groups. We addressed this by estimating the divergence between the 
Denisova and the human genome reference sequence as a fraction of 
the divergence between present-day humans and the common ancestor 
shared with the chimpanzee. To do this, we scored the frequency with 
which the Denisova genome carries the human versus the chimpanzee 
state at positions where the human and chimpanzee reference genomes 
differ, assuming constant evolutionary rates (Supplementary Information 
section 2). We restricted this analysis to the parts of the human 
reference genome that are of African ancestry’’ as gene flow from 
Neanderthals to non-Africans* could otherwise complicate these ana- 
lyses. The Denisova genome diverged from the reference human genome 
11.7% (CI: 11.4-12.0%) of the way back along the lineage to the human- 
chimpanzee ancestor. For the Vindija Neanderthal, the divergence is 
12.2% (CI: 11.9-12.5%). Thus, whereas the divergence of the Denisova 
mtDNA to present-day human mtDNAs is about twice as deep as that of 
Neanderthal mtDNA”, the average divergence of the Denisova nuclear 
genome from present-day humans is similar to that of Neanderthals. 

A possible explanation for the similar divergence of the Denisova 
individual and Neanderthals from present-day Africans is that they 
both descend from a common ancestral population that separated 
earlier from ancestors of present-day humans. Such a scenario would 
predict a closer relationship between the Denisova individual and 
Neanderthals than between either of them and present-day humans. 
To test this prediction, we estimated the divergence between pairs of 
seven ancient and modern genomes (Denisova, Neanderthals, French, 
Han, Papuan, Yoruba and San), using an approach where we correct 
for error rates in each genome based on the assumption that each has 
the same number of true differences from chimpanzee (Supplementary 
Information section 6). The average divergence between Denisova and 
Vindija Neanderthals is estimated to be 9.84% of the way to the 
chimpanzee-human ancestor; that is, less than the average 12.38% 
divergence of both from present-day Africans. Assuming 6.5 million 
years for human-chimpanzee divergence, this implies that DNA 
sequences of Neanderthals and the Denisova individual diverged on 
average 640,000 years ago, and from present-day Africans 804,000 years 
ago. 
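
The conversion from relative divergence to years is a simple proportion. The Python sketch below assumes a constant evolutionary rate and the 6.5-million-year human-chimpanzee divergence stated above; the function name is hypothetical, and small differences from the quoted dates reflect rounding of the percentages.

```python
HUMAN_CHIMP_DIVERGENCE_YEARS = 6.5e6  # calibration stated in the text


def divergence_years(fraction_of_lineage: float,
                     calibration_years: float = HUMAN_CHIMP_DIVERGENCE_YEARS) -> float:
    """Scale a divergence, expressed as a fraction of the way back to the
    human-chimpanzee ancestor, to years."""
    return fraction_of_lineage * calibration_years


# Denisova versus Vindija Neanderthal: 9.84% of the lineage.
denisova_neanderthal = divergence_years(0.0984)  # roughly 0.64 million years
# Both archaic genomes versus present-day Africans: 12.38% of the lineage.
archaic_african = divergence_years(0.1238)       # roughly 0.80 million years
```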

To analyse further the relationship of the Denisova individual and 
Neanderthals, we aligned Denisova, Neanderthal and Yoruba sequences 
to the chimpanzee genome, picked a single sequence at random to rep- 
resent each group, and examined sites where two copies of a derived and 
one copy of an ancestral allele were observed. Sequencing errors are 
expected to make a negligible contribution at such sites. The number of 
sites where the Denisova individual and Neanderthal cluster to the exclu- 
sion of the Yoruba and chimpanzee is 46,362, compared with an average 
of 22,012 sites for the other two possible patterns (Yoruba and Denisova, 
or Yoruba and Neanderthal). This excess of sites where Denisova and 
Neanderthal cluster supports the view that the Denisova individual and 
Neanderthals share a common history since separating from the ancestors 
of modern humans (Supplementary Information section 6). 
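
The site-pattern count described above can be sketched in Python as follows. The sketch assumes one sampled allele per group at each aligned site, treats the chimpanzee allele as ancestral, and skips sites with missing data or with other than two alleles; the function name and the toy alignment are hypothetical and are not data from this study.

```python
from collections import Counter


def count_shared_derived(denisova: str, neanderthal: str, yoruba: str, chimp: str) -> Counter:
    """Count sites where exactly two of the three hominins carry the
    derived (non-chimpanzee) allele, grouped by which pair shares it."""
    counts = Counter()
    for d, n, y, c in zip(denisova, neanderthal, yoruba, chimp):
        alleles = {d, n, y, c}
        # Keep only clean biallelic sites.
        if len(alleles) != 2 or not alleles <= set("ACGT"):
            continue
        derived = (d != c, n != c, y != c)
        if derived == (True, True, False):
            counts["Denisova+Neanderthal"] += 1
        elif derived == (True, False, True):
            counts["Denisova+Yoruba"] += 1
        elif derived == (False, True, True):
            counts["Neanderthal+Yoruba"] += 1
    return counts


# Toy six-site alignment (illustrative only).
toy = count_shared_derived("AGTCAN", "AGCTAA", "GATTAA", "GACCAG")

# Ratio of the counts reported above: 46,362 sites versus an average of 22,012.
reported_excess = 46_362 / 22_012
```

The reported counts correspond to roughly a two-fold excess of sites at which the Denisova individual and the Neanderthal cluster together.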
A Neanderthal-specific bottleneck 


The fact that the Denisova nuclear genome on average shares a more 
recent common ancestor with Neanderthal than with present-day 
humans raises the question of whether the overall DNA sequence 
divergence of the Denisova individual falls inside the group morpho- 
logically and geographically defined as Neanderthals, or if it repre- 
sents a sister group to Neanderthals. 

To investigate this question, we took advantage of the fact that in 
addition to the three individuals from Vindija Cave, Croatia, from which 
most of the Neanderthal genome sequences were produced, we have 
determined nuclear DNA sequences from three further Neanderthal 
individuals from Russia, Spain and Germany’*. Of these, the 60,000- 
70,000-year-old skeleton of a Neanderthal child found in Mezmaiskaya 
Cave, Russia, is both oldest and geographically closest to the Denisova 
individual. Using the 56 Mb of autosomal DNA sequences determined 
from this specimen’, we estimate that the DNA sequence divergence 
between the Vindija and Mezmaiskaya Neanderthals corresponds to 
a date of 140,000 ± 33,000 years ago (Supplementary Information 
section 6) (Fig. 1). This remarkably low divergence—which is about 
one-third of the closest pair of present-day humans that we analysed— 
is in agreement with the observation that diversity among Neanderthal 
mtDNAs is low relative to present-day humans” and indicates that the 
Vindija and Mezmaiskaya Neanderthals descend from a common 
ancestral population that experienced a drastic bottleneck since sepa- 
rating from the ancestors of the Denisova individual. 

To understand further the bottleneck in the history of Vindija and 
Mezmaiskaya Neanderthals, we examined four-way alignments of the 
Vindija Neanderthal genome sequence, the Mezmaiskaya Neanderthal, 
the Denisova individual and the chimpanzee genome. At transversion 
substitutions where two copies of the derived alleles are observed, we 
detect 924 substitutions that cluster the Vindija and Mezmaiskaya 
Neanderthals, 80 that cluster Vindija and Denisova, and 81 that 
cluster Mezmaiskaya and Denisova. This corresponds to at least a 
65% probability that the DNA sequences in the Neanderthals share 
a common ancestor more recently than their split from the ancestor of 
the Denisova individual (Supplementary Information section 7). It is 
much higher than the 15-20% probability associated with the ‘Out of 
Africa’ bottleneck common to present-day non-Africans™. If we replace 
the Mezmaiskaya Neanderthal in this analysis with a Neanderthal from 
El Sidron, Spain, or from Feldhofer, Germany, results are qualitatively 
similar although numbers are smaller (Supplementary Information 
section 7). Thus, we conclude that late Neanderthals across a broad 
geographical range have a population history distinct from that of the 
Denisova individual in that they share a strong population bottleneck 
not experienced by the ancestors of the Denisova individual. We call 
the group to which this individual belonged Denisovans in analogy to 
Neanderthals, as Denisovans are described for the first time based on 
molecular data from Denisova Cave just as Neanderthals were first 
described based on skeletal remains retrieved in the Neander Valley 
in Germany. 

Figure 1 | A neighbour-joining tree based on pairwise autosomal DNA 
sequence divergences for five ancient and five present-day hominins. 
Vindija 33.16, Vindija 33.25 and Vindija 33.26 refer to the catalogue numbers of 
the Neanderthal bones. 


No Denisovan gene flow into all Eurasians 


We have previously shown that Vindija Neanderthals share more 
derived alleles with non-Africans than with Africans, consistent with 
Neanderthals contributing 1-4% of the genomes of present-day 
humans across Eurasia8. To investigate the extent to which the 
Denisova individual shares this pattern, we examined alignments of 
sets of four genomes, each consisting of an African (Yoruba or San), 
a Eurasian (French or Han), an archaic hominin (Neanderthal or 
Denisovan) and the chimpanzee. We randomly sampled one allele from 
each of the three hominins, and counted all transversion differences 
between the African and the Eurasian where the archaic individual 
carries the derived allele (the ‘D statistics’ of ref. 8). Neanderthals match 
the French genome on average 4.6 ± 0.7% more often than they match 
the Yoruba genome (Table 1). Although the Denisova individual also 
matches the French more than the Yoruba genome, this skew is 
significantly less strong at 1.8 ± 0.5%. The estimates of D statistics were 
quantitatively consistent (within two standard deviations) for all other 
choices of Eurasian and African populations (Table 1). These findings 
indicate that the archaic component of the Eurasian gene pool is less 
closely related to the Denisova individual than to Neanderthals. 
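
As a check on the arithmetic, the D statistic can be recomputed from the site counts as D = (nBABA - nABBA)/(nBABA + nABBA). The Python sketch below uses the French/Yoruba counts listed in Table 1; the function name is hypothetical, and the standard errors in the table come from a block jackknife that is not reproduced here.

```python
def d_statistic(n_baba: int, n_abba: int) -> float:
    """Normalized difference between the two discordant site-pattern counts."""
    return (n_baba - n_abba) / (n_baba + n_abba)


# French (H1) versus Yoruba (H2), counts from Table 1.
d_neanderthal = d_statistic(21_794, 19_890)
d_denisova = d_statistic(34_262, 33_078)

print(f"Neanderthal: {100 * d_neanderthal:.1f}%  Denisova: {100 * d_denisova:.1f}%")
# prints "Neanderthal: 4.6%  Denisova: 1.8%"
```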


We also examined 13 genomic regions that we previously identified 
as candidates for a contribution of archaic genetic material into non- 
Africans, based on their deeper genetic divergences in non-Africans 
than in Africans*. Using ‘tag SNPs’ that are informative about whether 
a haplotype is from the lineage unique to non-Africans, we find that 
the Denisova individual matches the deeply diverged non-African 
haplotype in 6 cases, whereas Neanderthals do so in 11 cases 
(Supplementary Information section 7). Thus, both Neanderthals and 
Denisovans are more related than would be expected by chance to 
these genomic segments, but the signal in Denisovans is weaker. 

These analyses indicate that Neanderthals are more closely related 
than Denisovans to the population that contributed to the gene pool 
of the ancestors of present-day Eurasians. The fact that Eurasians 
share some additional affinity with the Denisova individual relative 
to Africans is compatible with a scenario in which Denisovans shared 
some of their history with Neanderthals before the gene flow from 
Neanderthals into modern humans occurred. 


Denisovan gene flow into the ancestors of Melanesians 


Although the Denisova individual derives from a population that was 
not directly involved in the gene flow from Neanderthals to Eurasians, 
it is possible that Denisovans admixed with the ancestors of present- 
day people in some parts of the Old World. To investigate this, we 
analysed the relationship of the Denisova genome to the genomes of 
938 present-day humans from 53 populations that have been geno- 
typed at 642,690 single nucleotide polymorphisms (SNPs)**. We 


Table 1 | Sharing of derived alleles between present-day and archaic hominins 


Sample Hy Sample Ho Source of data D(Hi, Ho, Neanderthal, chimpanzee) D(Hj, Ho, Denisova, chimpanzee) 
for H,; and H 
, ‘ NBABA NABBA D(%) s.e.(%) Z-score NBABA NABBA D(%) s.e.(%) Z-score 
Eurasian/Eurasian*
French Han Ref. 8 17,214 17,602 -1.1 0.8 -1.4 27,250 27,265 0.0 0.6 0.0
Karitiana Sardinian This study 1,116 1,085 1.4 2.1 0.7 1,559 1,627 -2.1 1.8 -1.2
Karitiana Cambodian This study 1,683 1,707 -0.7 1.8 -0.4 2,371 2,460 -1.8 1.5 -1.2
Karitiana Mongolian This study 1,128 1,195 -2.9 2.2 -1.3 1,765 1,742 0.7 1.8 0.4
Sardinian Cambodian This study 2,592 2,670 -1.5 1.5 -1.0 3,935 3,925 0.1 1.2 0.1
Sardinian Mongolian This study 1,966 2,027 -1.5 1.6 -0.9 3,036 3,057 -0.3 1.3 -0.3
Cambodian Mongolian This study 2,81 2,804 0.1 1.4 0.1 4,442 4,342 1.1 1.2 1.0
African/African*
San Yoruba Ref. 8 23,690 23,855 -0.3 0.6 -0.6 39,042 39,019 0.0 0.5 0.1
Melanesian/Melanesian*
Papuan2 Bougainville This study 3,35 3,284 1.0 1.3 0.8 5,319 5,140 1.7 1.1 1.5
Eurasian/African*
French San Ref. 8 25,242 22,982 4.7 0.6 7.6† 39,838 38,495 1.7 0.5 3.4†
French Yoruba Ref. 8 21,794 19,890 4.6 0.7 6.9† 34,262 33,078 1.8 0.5 3.6
Han San Ref. 8 25,081 22,470 5.5 0.6 8.5† 38,815 37,439 1.8 0.5 3.4†
Han Yoruba Ref. 8 21,741 19,412 5.7 0.7 7.9† 33,182 32,184 1.5 0.5 2.8
Karitiana Mbuti This study 1,577 1,473 3.4 1.9 1.8 2,368 2,360 0.2 1.5 0.1
Sardinian Mbuti This study 2,562 2,400 3.3 1.5 2.2 4,028 3,784 3.1 1.2 2.6
Cambodian Mbuti This study 4,235 3,641 7.5 1.2 6.5† 6,329 5,850 3.9 1.0 4.0†
Mongolian Mbuti This study 3,077 2,765 5.3 1.4 3 oF 4,514 4,505 0.1 1.1 0.1
Eurasian/Melanesian*
French Papuan1 Ref. 8 15,523 15,548 -0.1 0.8 -0.1 23,509 25,470 -4.0 0.7 -5.7†
Han Papuan1 Ref. 8 15,059 14,677 1.3 0.9 1.5 22,262 24,198 -4.2 0.7 -5.8†
Karitiana Papuan2 This study 1,522 1,658 -4.3 1.9 -2.2 2,201 2,641 -9.1 1.6 -5.8†
Karitiana Bougainville This study 1,577 1,717 -4.3 1.8 -2.4 2,229 2,671 -9.0 1.5 -5.9†
Sardinian Papuan2 This study 2,447 2,647 -3.9 1.5 -2.6 3,714 4,150 -5.5 1.2 -4.5†
Sardinian Bougainville This study 2,531 2,762 -4.4 1.5 -3.0 3,877 4,336 -5.6 1.1 -4.9†
Cambodian Papuan2 This study 3,713 3,891 -2.3 1.3 -1.8 5,457 6,272 -6.9 1.1 -6.5†
Cambodian Bougainville This study 3,847 3,994 -1.9 1.2 -1.6 5,751 6,333 -4.8 1.0 -4.7†
Mongolian Papuan2 This study Z2ies 2,852 -1.2 1.5 -0.8 4,192 4,758 -6.3 1.2 -5.3†
Mongolian Bougainville This study 2,813 3,066 -4.3 1.5 -2.9 4,234 4,847 -6.8 1.1 -6.0†
Melanesian/African*
Papuan1 San Ref. 8 21,985 20,366 3.8 0.7 5.1† 35,923 32,841 4.5 0.6 7.2†
Papuan1 Yoruba Ref. 8 19,107 17,646 4.0 0.8 4.9† 30,995 28,186 4.7 0.6 7.4†
Papuan2 Mbuti This study 3,832 3,324 7.1 1.3 5.4† 6,124 5,255 7.8 1.1 7.2†
Bougainville Mbuti This study 4,216 3,596 7.9 1.2 6.8 6,498 5,633 7.1 1.1 6.7†
We present the D statistic D(H1, H2, X, chimpanzee), the normalized difference between the number of sites at which the derived allele in an archaic read from X matches human sample H1 (n_BABA) and human
sample H2 (n_ABBA); thus, its value is D = (n_BABA − n_ABBA)/(n_BABA + n_ABBA). We restrict to autosomal transversion substitutions, compute standard errors (s.e.) from a block jackknife, and highlight (dagger symbol)
the D statistics that are more than 3 s.d. from zero (|Z| > 3). Both Neanderthals and Denisovans match Eurasians more than the Africans, but the signals are consistently and significantly stronger when X = Neanderthal
than when X = Denisova. The slight numerical differences with Table 4 of ref. 8 are due to differences in the data filtering. Here we restrict to comparisons of present-day human samples that were sequenced by the
same protocol (the five individuals sequenced in ref. 8, or the seven in this study); Supplementary Table 8.2 presents the complete set of pairwise comparisons.


* Comparison. 
† D statistics that are more than 3 s.d. from zero (|Z| > 3).
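
As a hedged illustration of the table legend, the following Python sketch computes D from a pair of site counts and estimates its standard error with a delete-one block jackknife. The function names, the equal weighting of blocks and the per-block toy counts are assumptions made for illustration; only the French/San count pair is taken from Table 1, and the procedure actually used is the one described in the Supplementary Information.

```python
from math import sqrt


def d_statistic(n_baba, n_abba):
    """D = (n_BABA - n_ABBA) / (n_BABA + n_ABBA), as defined in the legend."""
    return (n_baba - n_abba) / (n_baba + n_abba)


def block_jackknife_se(blocks):
    """Delete-one block jackknife standard error of D.

    `blocks` is a list of (n_baba, n_abba) pairs, one per genomic block.
    Assumption: all blocks are weighted equally, and there are at least
    two blocks with non-zero counts.
    """
    g = len(blocks)
    total_baba = sum(b for b, _ in blocks)
    total_abba = sum(a for _, a in blocks)
    # Recompute D with each block left out in turn.
    leave_one_out = [
        d_statistic(total_baba - b, total_abba - a) for b, a in blocks
    ]
    mean = sum(leave_one_out) / g
    variance = (g - 1) / g * sum((d - mean) ** 2 for d in leave_one_out)
    return sqrt(variance)


# Counts from Table 1: French versus San with X = Neanderthal.
d_french_san = d_statistic(25242, 22982)
print(f"D = {100 * d_french_san:.1f}%")  # prints "D = 4.7%"

# Hypothetical per-block counts, used only to show the jackknife call.
toy_blocks = [(120, 100), (95, 105), (130, 110), (101, 99), (115, 92)]
d_toy = d_statistic(sum(b for b, _ in toy_blocks),
                    sum(a for _, a in toy_blocks))
se_toy = block_jackknife_se(toy_blocks)
z_toy = d_toy / se_toy  # Z score: D divided by its standard error
```

When blocks contain very different numbers of informative sites, a weighted jackknife would be the more appropriate choice.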
scored each of these present-day humans based on their relative
proximity to Neanderthals and the Denisova individual at positions 
where we have high-quality data for both the Neanderthal and 
Denisova genomes (Supplementary Information section 8). Using 
the means of the 53 populations, the first two principal components 
separate the populations into three groups (Fig. 2): first, the 7 sub- 
Saharan African populations; second, a group of 44 non-African 
populations as well as one north African group; and third, Papuan 
and Bougainville populations from Melanesia. When individuals 
from selected populations are analysed separately, the Papuan and 
Bougainville islanders remain distinct from almost all individuals 
outside Africa (Supplementary Fig. 8.1b). Thus, with respect to their 
relationship to Neanderthals and Denisovans, the Melanesian popu- 
lations stand out relative to other non-African populations. 

To explore this further, we analysed the relationship of the Denisova 
genome to the genomes of five present-day humans that we previously 
sequenced to about fivefold coverage⁸ (a Yoruba and a San genome
from Africa, a French genome from Europe, a Han genome from China 
and a Papuan genome from Melanesia), as well as seven present-day 
humans that we sequenced to 1-2-fold coverage for this study (a Mbuti 
genome from Africa, a Sardinian genome from Europe, a Mongolian 
genome from Central Asia, a Cambodian genome from South-East 
Asia, an additional Papuan genome from Melanesia, a Bougainville 
islander genome from Melanesia, and a Karitiana genome from South 
America) (Supplementary Information section 9). We used the D 
statistic⁸ to test if various pairs of present-day humans share equal
numbers of derived alleles with the Denisova individual. To do this, 
we restricted comparisons to pairs of present-day humans sequenced 
at the same time to minimize the chance that differences in sample 
processing could affect the results. We find that the fivefold coverage 
Papuan individual shares 4.0 ± 0.7% more alleles with the Denisova
individual than does the French individual, and we observed a similar 
skew in all 10 comparisons of Melanesian and other non-African 
populations (Table 1). When we stratified the data by base substitution 
class and chromosome, the D statistics are qualitatively unchanged 
(Supplementary Information section 10). Similarly, the D statistics
are consistent for all depths of read coverage, indicating that mapping
errors, for example due to segmental duplications, are not likely to
explain these results. Finally, differences in sequencing error rate across
samples cannot explain the observed D statistics (Supplementary
Information section 10).

Figure 2 | Relationship of present-day populations to the Denisova
individual and Neanderthals based on 255,077 SNPs. Principal component
analysis of the means of 53 present-day human populations projected onto the
top two principal components defined by Denisova, Neanderthal and
chimpanzee. The seven ‘African’ populations are San, Mbuti, Biaka, Bantu
Kenya, Bantu South Africa, Yoruba and Mandenka; the ‘Non-African’
populations are 44 diverse groups from outside Africa except for Papuan and
Bougainville islanders.

Under the assumption that gene flow explains these observations, 
we determined the direction of this gene flow by asking whether 
Melanesians and other Eurasians share derived alleles with Africans 
equally often. If the gene flow was entirely into the ancestors of the 
Denisovan individual, we would not expect this to affect the relation- 
ship of Africans to Melanesians and other Eurasians and thus we would 
expect them to share derived alleles equally often with Africans. 
However, we find that derived alleles in Africans match Melanesians 
3.4 ± 0.4% less often than other non-Africans (Z = 10.8). Because this
skew is seen without using Denisovan data it cannot be explained by 
gene flow into Denisovans or, for example, by contamination of the 
Denisova sample by present-day Melanesian DNA. Thus, at least some 
of the putative gene flow must have been into Melanesians (Sup- 
plementary Information section 8). 

When we compare the skew in the fraction of derived alleles shared
with the two archaic hominins to what would be expected for individuals
of 100% Neanderthal or Denisova ancestry, respectively (Supplementary
Information section 8 and ref. 8), we estimate that 2.5 ± 0.6% of the
genomes of non-African populations derive from Neanderthals, in
agreement with our previous estimate of 1–4%⁸. In addition, we estimate
that 4.8 ± 0.5% of the genomes of Melanesians derive from Denisovans.
Altogether, as much as 7.4 ± 0.8% of the genomes of Melanesians may
thus derive from recent admixture with archaic hominins.
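
The arithmetic behind the combined figure can be checked with a short sketch. It assumes, for illustration only, that the Neanderthal and Denisovan estimates have independent errors that add in quadrature; the variable names are hypothetical. The sum of the rounded estimates is 7.3%, so the quoted 7.4% presumably reflects unrounded values.

```python
from math import sqrt

# Estimates quoted in the text, in percent of the genome.
neanderthal, neanderthal_se = 2.5, 0.6
denisovan, denisovan_se = 4.8, 0.5

# Total archaic contribution to Melanesian genomes.
total = neanderthal + denisovan

# Assumption: the two errors are independent, so they add in quadrature.
total_se = sqrt(neanderthal_se ** 2 + denisovan_se ** 2)

print(f"{total:.1f} +/- {total_se:.1f}%")
```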


A model of population history 


To understand the implications of the relationships observed among 
the Denisova individual, the Neanderthals and present-day humans, we 
fit the D statistics described in the previous sections to a parameterized 
model of population history. The D statistics for the Denisova indi- 
vidual differ in two important ways from those for the Neanderthal. 
First, the Denisova individual shares fewer derived alleles with either the 
French or Han Chinese populations than do the Neanderthals. Second, 
the Denisova individual shares more derived alleles with the Papuans 
than do the Neanderthals. We are able to fit the data with a model that 
assumes the Denisovans are a sister group of Neanderthals with a 
population divergence time of one-half to two-thirds of the time to 
the common ancestor of Neanderthals and humans. After the diver- 
gence of the Denisovans from Neanderthals, there was gene flow from 
Neanderthals into the ancestors of all present-day non-Africans. Later 
there was admixture between the Denisovans and the ancestors of 
Melanesians that did not affect other non-African populations. This 
model is illustrated in Fig. 3 and is described in detail in Supplementary 
Information section 11. 

Other, more complex models could also explain the data. For 
example, a model that invokes only gene flow from Denisovans to 
Melanesian ancestors outside Africa and assumes four subpopulations 
in Africa that existed between the times of the origin of Denisovan and 
Neanderthal ancestors and the ancestors of present-day Eurasians 
could also fit the data (Supplementary Fig. 11.4). However, because 
barriers to gene flow between such subpopulations would have to 
persist for hundreds of thousands of years to create the observed 
patterns, such a model is less plausible on biological grounds than a 
model that invokes two instances of gene flow outside Africa. 


Discordance of mtDNA and nuclear histories 

The population history indicated by the nuclear genome is different 
from that indicated by the mtDNA phylogeny. There are two possible 
explanations for this. One is that the mtDNA lineage was introduced 
into Denisovan ancestors by admixture from another hominin lineage 
for which we have no data. The other is that the discordance is the 
result of ‘incomplete lineage sorting’, that is, the random assortment
of genetic lineages due to genetic drift which may have allowed a
divergent mtDNA lineage to survive in Denisovans by chance while
becoming lost in Neanderthals and modern humans. A large ancestral
population size makes incomplete lineage sorting more likely to
occur. In Supplementary Information section 11, we show that given
reasonable assumptions about the size of the ancestral populations,
the discordance of the mtDNA phylogeny with that indicated by the
nuclear DNA can be explained either by a small amount of admixture
from another archaic hominin or by incomplete lineage sorting. Thus,
the data do not allow us to favour one hypothesis over the other.

Figure 3 | A model of population history compatible with the data. N
denotes effective population size, t denotes time of population separation, f
denotes amount of gene flow and t_GF denotes time of gene flow.

Figure 4 | Morphology of the Denisova molar. a, b, Occlusal (a) and mesial
(b) views. c, Comparison of the Denisova molar to diverse third molars, in a
biplot of the mesiodistal and buccolingual lengths (in mm). AMH, anatomically


A tooth from Denisova Cave 


In 2000, a hominin tooth was discovered in layer 11.1 of the south 
gallery of Denisova Cave (Fig. 4a, b). The tooth is from a young adult 
and therefore from a different individual than the phalanx, which stems
from a juvenile (Supplementary Information section 12). To elucidate 
the relationship of the tooth to the individual from which the phalanx is 
derived, we extracted DNA from 50 mg of dentin from the root of the 
tooth and prepared a sequencing library (Supplementary Information 
section 13). About 0.17% of random DNA sequences determined 
from this library aligned to the human genome, whereas the rest is 
likely to represent microbial contamination common in ancient bones. 
We therefore used a novel DNA capture approach* to isolate mtDNA 
sequences from the sequencing library. A total of 15,094 sequences 
were identified which allowed the complete mtDNA genome to be 
assembled at an average coverage of 58-fold. This sequence differs at 
two positions from the mtDNA of the phalanx whereas it differs at 
about 380 positions from both Neanderthal and present-day human mtDNAs.
The time since the most recent common ancestor of the two mtDNAs 
from Denisova Cave is estimated to be 7,500 years, with a 95% upper 
bound of 16,000 years (Supplementary Information section 13). We 
conclude that the tooth and the phalanx derive from two different 
individuals that are probably from the same hominin population. 
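
As a rough consistency check on the coverage figure, the sketch below relates the number of captured sequences to the mean coverage. The mtDNA genome length of about 16,570 bp is an assumption (it is not stated in this section), as are the variable names; the result is only an implied average fragment length, not a reported value.

```python
# Assumed mtDNA genome length in base pairs; not given in the text.
MTDNA_LENGTH_BP = 16_570

n_sequences = 15_094  # mtDNA sequences identified (from the text)
mean_coverage = 58    # average fold coverage (from the text)

# coverage = n_sequences * mean_length / genome_length,
# solved here for the mean fragment length.
implied_mean_length_bp = mean_coverage * MTDNA_LENGTH_BP / n_sequences

print(f"implied mean fragment length: {implied_mean_length_bp:.0f} bp")
```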


Morphology of the Denisova molar 

The tooth is an almost complete left, probably third, but possibly 
second, upper molar (Fig. 4b). The crown is trapezoidal and tapers 
strongly distally, with bulging lingual and buccal walls giving the 
tooth an inflated appearance (Supplementary Information section 
12). The roots are short but robust and strongly flaring. 

Overall, the tooth is very large (mesiodistal diameter, 13.1 mm; 
buccolingual, 14.7 mm). As a third molar, it is outside the range of 
normal size variation of all fossil taxa of the genus Homo, with the 
exception of H. habilis and H. rudolfensis, and comparable to 
Australopithecines (Fig. 4c). Compared to second molars, it is larger
than Neanderthals or early modern humans, but similar to H. erectus
and H. habilis (Supplementary Fig. 12.1).

modern humans; SH, Sima de los Huesos. Supplementary Fig. 12.1 presents a
similar comparison to second molars.

Besides size, it is also distinguished from most Neanderthal third 
molars by the absence of hypocone reduction, and from both second 
and third Neanderthal molars by the presence of a large talon basin 
and the strong flare of the crown. Furthermore, it lacks the lingual 
hypocone projection seen in all Neanderthal first and many second 
molars, and has strongly diverging roots, unlike the closely spaced and 
frequently fused roots of Neanderthals. 

It is of particular interest to compare the Denisova molar to Middle 
Pleistocene hominins from China, where H. erectus and other archaic 
forms, sometimes interpreted as H. heidelbergensis, may have survived 
until recently. Unfortunately, very few of these fossils preserve third 
upper molars. Of the few examples that are available, most differ from 
the Denisova molar by their strongly reduced size. Second molars are 
more frequent than third molars, and most have a trapezoidal shape 
like Denisova, but they do not have the lingually skewed position of the 
hypocone and metacone and the strong basal flare of the crown. 

The Denisova molar supports the DNA evidence that the Denisovan 
population is distinct from late Neanderthals as well as from modern 
humans. In fact, the primitive traits of the Denisova tooth suggest that 
Denisovans may have been separated from the Neanderthal lineage 
before Neanderthal dental features are documented in Western 
Eurasia (>300,000 years BP) (Supplementary Information section 12),
although we cannot exclude the possibility that the Denisovan dental 
morphology results from a reversion. 


Stratigraphy and dating 

The small size of both the phalanx and the tooth precludes direct 
radiocarbon dating. We instead dated seven bone fragments found 
close to the hominin remains in layer 11 in the east and south galleries. 
To ensure that they were associated with human occupation of the 
cave we chose bones that have evidence of human modification, 
including a rib with regular incisions and a bone projectile point blank 
generally associated with Upper Palaeolithic cultural assemblages. In 
the south gallery, where modified bones were not available, we used 
herbivore bones (Supplementary Information section 12). 

Four of the seven dates are infinite dates older than 50,000 years BP 
(uncalibrated), whereas three are finite dates between 16,000 and 
30,000 years BP (Supplementary Table 12.1). The rib with incisions 
and the projectile point blank are about 30,000 and 23,000 years BP,
respectively. Together with three previous dates” this shows that layer 
11 contains cultural remains from at least two different time periods, 
one period older than 50,000 years BP and one more recent period.
However, the stratigraphy is complicated by the discovery of a wedge- 
shaped area close to the area where the phalanx was found that is likely 
to be disturbed (Supplementary Information section 12). Hominin 
remains large enough to allow direct radiocarbon dates may even- 
tually be discovered in the cave, but a reasonable hypothesis is that the 
phalanx and molar belong to the older occupation. 


Discussion 


The molecular preservation of the Denisova phalanx is exceptional in 
that the fraction of endogenous relative to microbial DNA is about 
70%. By contrast, in all Neanderthal remains studied so far the relative 
abundance of endogenous DNA is below 5%, and typically below 1%. 
Furthermore, the average length of hominin DNA fragments in the 
Denisova phalanx is 58 base pairs (bp) (SL3003) and 74 bp (SL3004) 
in spite of the enzymatic treatment that removes uracil residues and 
decreases the average fragment size, whereas in most well-preserved 
Neanderthal samples it is 50 bp or smaller without this treatment. 
Thus, although many Neanderthals are preserved under conditions 
apparently similar to those in Denisova Cave, the Denisova phalanx is 
one of few bones found in temperate conditions that are as well pre- 
served as many permafrost remains*””*. It is not clear why this is. It is 
not due to some condition that affects all hominin remains in 
Denisova Cave because the fraction of endogenous DNA in the tooth
is 0.17%; that is, typical of other Late Pleistocene hominin remains. It 
is possible that a rapid desiccation of the tissue after death, which 
would limit degradation of the DNA by endogenous enzymes as well 
as microbial growth, has allowed this exceptional preservation. 

The Denisova individual and the population to which it belonged 
carry some exceptionally archaic molecular (mtDNA) as well as mor- 
phological (dental) features. Nevertheless, the picture that emerges 
from analysis of the nuclear genome is one where the Denisova popu- 
lation is a sister group to Neanderthals. Three possibilities could 
account for how such archaic features have come to be present in 
Denisovans. One possibility is that these features were retained in 
Denisovans but became lost in modern humans and Neanderthals. 
A second, not mutually exclusive, possibility is that they entered the 
Denisova population through gene flow from some even more 
diverged hominin. Although such gene flow cannot be detected with 
the current mtDNA and nuclear DNA data, further sequencing of 
other hominin remains may in the future allow testing for it. A third 
possibility that could account for the apparently archaic dental mor- 
phology, but not the mtDNA, is a reversal to ancestral traits. 

After they diverged from one another, Denisovans and Neanderthals 
had largely separate population histories as shown by a number of 
observations. First, patterns of allele sharing indicate that Denisovan 
ancestors did not contribute genes at a detectable level to present-day 
people all over Eurasia whereas Neanderthals did*. Thus, Neanderthals 
at some point interacted with ancestors of present-day Eurasians inde- 
pendently of Denisovans. Second, the genetic diversity of Neanderthals 
across their geographical range in the last thirty or forty thousand years 
of their history was extremely low, indicating that they experienced one 
or more strong genetic bottlenecks independently of the Denisovans. 
Third, our results indicate that Denisovans but not Neanderthals con- 
tributed genes to ancestors of present-day Melanesians. Fourth, the 
dental morphology shows no evidence of any derived features seen in 
Neanderthals. In fact, dental remains from the Sima de los Huesos of 
Atapuerca, for which ages between 350,000 and 600,000 years have been 
proposed’, already carry Neanderthal-like morphological features 
that are not seen in the Denisova molar. 

An interesting question is how widespread Denisovans were. A 
possibility is that they lived in large parts of East Asia at the time 
when Neanderthals were present in Europe and western Asia. One 
observation compatible with this possibility is that Denisovan rela- 
tives seem to have contributed genes to present-day Melanesians but 
not to present-day populations which currently live much closer to 
the Altai region such as Han Chinese or Mongolians (Table 1). Thus, 
they have at least at some point been present in an area where they 
interacted with the ancestors of Melanesians and this was presumably 
not in southern Siberia. Further studies of both molecular and mor- 
phological features of hominin remains across Asia should clarify how 
widespread Denisovans were and how they were related to archaic 
hominins other than Neanderthals. 

The Denisova individual belongs to a hominin group that shares a 
common ancestor with Neanderthals but has a distinct population 
history. We define this group based on genomic evidence and call it 
Denisovans, but refrain from any formal Linnaean taxonomic desig- 
nations that would indicate species or subspecies status for either 
Neanderthals or Denisovans. In our view, these results show that on 
the Eurasian mainland there existed at least two forms of archaic 
hominins in the Upper Pleistocene: a western Eurasian form with 
morphological features that are commonly used to define them as 
Neanderthals, and an eastern form to which the Denisova individuals 
belong. In the future, when more complete genomes from these and
other archaic hominins are sequenced from remains that allow
more morphological features to be assessed, their relationships will
become even better understood. This will be an important endeavour
as the emerging picture of Upper Pleistocene hominin evolution is one 
in which gene flow among different hominin groups was common. 


23/30 DECEMBER 2010 | VOL 468 | NATURE | 1059 






METHODS SUMMARY 


The thirteen sections of the Supplementary Information provide a full description 
of the methods. 
A population-specific HTR2B stop codon
predisposes to severe impulsivity 
Colin A. Hodgkinson, Liliana Dell’Osso, Jaana Suvisaari,


Emil Coccaro, Richard J. Rose, Leena Peltonen, Matti Virkkunen & David Goldman


Impulsivity, describing action without foresight, is an important feature of several psychiatric diseases, suicidality and 
violent behaviour. The complex origins of impulsivity hinder identification of the genes influencing it and the diseases 
with which it is associated. Here we perform exon-focused sequencing of impulsive individuals in a founder population, 
targeting fourteen genes belonging to the serotonin and dopamine domain. A stop codon in HTR2B was identified that is 
common (minor allele frequency > 1%) but exclusive to Finnish people. Expression of the gene in the human brain was 
assessed, as well as the molecular functionality of the stop codon, which was associated with psychiatric diseases marked 
by impulsivity in both population and family-based analyses. Knockout of Htr2b increased impulsive behaviours in 
mice, indicative of predictive validity. Our study shows the potential for identifying and tracing effects of rare alleles in 
complex behavioural phenotypes using founder populations, and indicates a role for HTR2B in impulsivity. 


Impulsivity is a broad term describing behaviour characterized by 
action without foresight, decreased inhibitory control and a lack of 
consideration of consequences’. Cognitive function, attention and res- 
ponses to reward are factors that are thought to contribute to the trait of 
impulsivity. Although impulsivity can be an adaptive dimension of 
personality, intolerance for delay, disinhibition and the inappropriate 
weighting of contingencies are maladaptive’. The behavioural manifes- 
tations of impulsivity include suicide, addictions, attention deficit 
hyperactivity disorder (ADHD) and violent criminality’, as well as 
antisocial personality disorder (ASPD), borderline personality disorder 
(BPD) and intermittent explosive disorder (IED). These behaviours 
and diagnoses, including impulsivity itself, are moderately heritable*”, 
indicating that it should be feasible to identify genes influencing them. 
Gene identification would also validate the idea that it is possible to 
deconstruct the multi-process origins of impulsivity. Still, studies 
demonstrating that genetic variation predicts impulsivity have been 
relatively sparse’. The fact that few genes influencing impulsivity 
have been discovered could reflect the complexity of the phenotype, 
the nature of the samples or the methodologies used. 

To detect novel alleles that influence impulsivity, we studied 
severely impulsive Finnish criminal offenders and matched controls. 
This study had six components (as charted in Supplementary Fig. 1): 
resequencing and identification of putatively functional variants in 
severe impulsive probands from a founder population; association 
and linkage with impulsive behaviour; population genetics; evaluation 
of cognitive effects of the identified variant; gene expression and 
functionality; and animal studies. 

Exon-centric sequencing was performed on fourteen genes involved 
in dopamine or serotonin function (the genes are listed in Supplementary 
Methods). Dysregulated activity of the monoamine neurotransmitters 


has been implicated in impulsivity both on a neuropharmacological basis 
and a genetic basis via gene knockouts and/or association studies with 
common functional variants. In rats, serotonin and dopamine interact 
in the control of impulsive choice, with differential actions in regions of 
the prefrontal cortex involved’*. The spontaneous impulsivity of rats 
correlates with lower levels of dopamine D2 receptors in the nucleus 
accumbens, predicting liability to compulsive drug seeking and addic- 
tion’’; also, in humans a reduction in D2 receptors, as well as a decrease 
in dopamine release, has been described in the ventral tegmental area of 
cocaine abusers’. The serotonin system has long been implicated in 
impulsivity’*'® and, in particular, impulsive aggression and suicide.
Maoa knockout mice have higher levels of monoamines and increased 
aggressive behaviour”’, and a functional variable number tandem repeat 
(VNTR) in the MAOA regulatory region (MAOA-LPR) moderates the 
effect of maltreatment on vulnerability to develop antisocial behaviour in 
humans*"*. It has been shown that a stop codon variant that produces 
complete deficiency of MAOA activity co-segregates with severe impul- 
sivity®. Stress-modified associations with suicidality have been reported 
also for a polymorphism in the serotonin transporter (degenerate repeat 
polymorphic region 5-HTTLPR in SLC6A4)”””*. 

Deep sequencing was recently successfully applied to gene iden- 
tification in rare Mendelian disorders”’. In the domain of complex 
disorders, sequencing revealed putatively functional alleles at a gene 
previously implicated by genome-wide association studies of type I 
diabetes”. Here we attempted to use sequencing to identify novel loci 
contributing to a non-Mendelian phenotype. 


Sequencing Finnish impulsive subjects 
Founder populations can increase power to detect effects of rare alleles. 
At autosomal loci, Finns are as diverse as other Europeans, yet a
restricted number of founders and isolation have moulded the Finnish
gene pool’. Many disease alleles are more abundant or unique to 
Finland and conversely some disease alleles common in other 
European populations are rare or nonexistent’*. From the standpoint 
of identifying rare or uncommon alleles with roles in complex pheno- 
types, it is perhaps most important that Finnish ancestry seems to have 
reduced the genetic heterogeneity of various diseases. For seventeen 
Finnish disease alleles, 70% of disease chromosomes (and as many as 
98% for some diseases) were attributable to a single allele”. 

Sequencing was conducted in 96 unrelated Finnish males with 
impulsive behaviour and an equal number of unrelated Finnish males 
free of psychiatric diagnoses (Supplementary Table 1 and Methods). 
Probands had ASPD, BPD or IED and were all violent offenders and 
arsonists who, because of the extreme nature of their crimes, under- 
went inpatient forensic psychiatric examination at the University of 
Helsinki at the time of their initial incarceration. ASPD and BPD 
share genetic risk for impulsive aggression*, which is a central char- 
acteristic of both of these personality disorders. Impulsivity is also key 
to IED, described in the Diagnostic and Statistical Manual of Mental 
Disorders III-R (DSM-III-R) as a failure to resist aggressive impulses. 

The 96 cases were selected for resequencing from a larger cohort of 
Finnish violent offenders comprising 228 cases on the basis that they 
had the highest Brown-Goodwin Lifetime Aggression scores: 23.7 
(standard deviation (s.d.) ± 4.9) as compared to 8.1 (s.d. ± 4.9) in
controls. Their higher scores were indicators of a life history of aggres- 
sive, violent and impulsive behaviour as behavioural manifestations of 
impulsive temperament. The 96 male controls were free of DSM-III-R 
Axis I and II diagnoses and matched for age, and were selected for 
sequencing for single nucleotide polymorphism (SNP) discovery from 
a larger control cohort comprising 295 individuals. As compared to 
controls, cases also had significantly higher impulsivity (action on the 
spur of the moment) scores on the Karolinska Scales of Personality 
(P <0.0001)**. However, analysis was conducted on a behaviourally 
based phenotype, rather than a measure of temperament, because 
behaviour has repeatedly shown the strongest relationship to biological 
predictors, including genes. Genetic loci previously implicated in 
impulsivity include the MAOA stop codon linked to impulsive beha- 
viour in one Dutch family®, 5-HTTLPR at the serotonin transporter, 
which predicts suicidality’, and the dopamine transporter VNTR, 
which has been associated with ADHD". Impulsive behaviour also can 
be predicted by neurotransmitters and endocrine factors, as illustrated 
by associations with brain serotonin turnover’’, testosterone levels and 
a gene-testosterone interaction’. Animal behavioural pharmacology, 
gene knockout and strain-difference studies all primarily rely on mea- 
sured behaviour. By selecting the most phenotypically extreme pro- 
bands for sequencing, we increased the probability that we would 
detect functional variants altering impulsivity. Clinical and criminal 
records, including evaluation of premeditation and spontaneity of 
crimes, were available for all cases. 

Exonic and promoter regions (comprising 82 kb) were amplified in 
pools of 12 genomic DNAs and sequenced simultaneously at 80x 
coverage on an Illumina Genome Analyser, as described in Methods. 
Sequencing allowed us to identify and accurately estimate frequencies 
of alleles (Supplementary Fig. 8 compares frequencies determined by 
sequencing and genotyping; correlation coefficient r = 0.94). Of 360 
SNPs identified, 44% were known (National Center for Biotechnology 
Information (NCBI) Build128). Frequencies of novel SNPs ranged as 
high as 0.2. Within 37 kb of protein-coding DNA, 25 synonymous 
SNPs, of which 9 were novel, and 26 nonsynonymous SNPs (nsSNPs), 
were detected. Of a total of 22 nsSNPs confirmed by Sanger sequencing, 
10 were novel. 


Association of putatively functional SNPs 


Four nsSNPs were predicted to be functional according to both SIFT
(sorting intolerant from tolerant) and PolyPhen (polymorphism phenotyping):
TPH2 Pro206Ser (rs17110563), DRD1 Ser259Tyr, HTR2B
Arg388Trp and HTR2B Q20*, a stop codon (Supplementary Table 5).
These four nsSNPs were genotyped in male Finnish cases and controls.
In a global test of association with an aggregate of potential
susceptibility variants, these four putatively functional variants were
twice as common in cases (13.0%) as in controls (6.5%;
χ² = 6.76, P = 0.009; Supplementary Table 6). However, this global
association was driven by HTR2B Q20*. Seventeen of 228 cases
were heterozygous for HTR2B Q20* compared to 7 of 295 controls
(χ² = 7.26, P = 0.007; Supplementary Table 6), with an allele
frequency in controls of 0.012. Eighty-nine
pedigrees comprising family members of the violent offenders were 
collected and all were genotyped without pre-selection for phenotype 
or genotype, identifying eight HTR2B Q20* carrier families (Fig. 1 
and Methods). Affected status was defined as presence of ASPD, BPD, 
or IED. The transmission disequilibrium test detected over-transmission 
of Q20* to affected offspring (McNemar χ² = 5.0, P = 0.025). Similarly, 
among affected individuals, 6/7 had Q20* transmitted, and among un- 
affected individuals 10/14 did not have Q20* transmitted (Supplemen- 
tary Table 7). From the cumulative binomial distribution, previously 
proposed for linkage of functional loci in families”’, the likelihood of 
16/21 or more linked outcomes was 0.013. 
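
Two of the figures quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the control allele frequency from 7 heterozygotes among 295 controls, and the cumulative binomial probability of 16 or more linked outcomes out of 21 (6 transmissions to affected plus 10 non-transmissions to unaffected individuals). It assumes that no control was homozygous and that each outcome has probability 0.5 under the null; both assumptions, and the function names, are illustrative.

```python
from math import comb


def allele_frequency(heterozygotes: int, homozygotes: int, n_individuals: int) -> float:
    """Allele frequency from genotype counts in diploid individuals."""
    return (heterozygotes + 2 * homozygotes) / (2 * n_individuals)


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 7 heterozygous controls out of 295 (assumed: no homozygous controls).
control_freq = allele_frequency(7, 0, 295)

# 16 or more "linked" outcomes out of 21 under a chance expectation of 0.5.
tail = binomial_upper_tail(16, 21)

print(round(control_freq, 3), round(tail, 3))  # prints: 0.012 0.013
```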

The HTR2B gene is on 2q36.3-q37.1, a location implicated in early- 
onset obsessive compulsive disorder’ and illicit substance abuse”. 
However, resequencing of HTR2B in these two studies yielded no 
functional variants’”’*. 5-HT2B receptor function in the brain is 
mainly unknown; however, it has been shown that 3,4-methylene- 
dioxymethamphetamine (MDMA, commonly known as ecstasy) 
selectively binds and activates 5-HT2B receptors, inducing serotonin 
release in mouse raphe nuclei, leading to dopamine release in the 
nucleus accumbens and ventral tegmentum”’, and 5-HT2B agonists 
increase serotonin transporter phosphorylation”. 


HTR2B Q20* in humans 


We assessed molecular functionality of HTR2B Q20* by using RNA 
and proteins extracted from lymphoblastoid cell lines, and in addition 
HTR2B expression was measured in multiple brain regions, including 
the frontal cortex, by quantitative polymerase chain reaction (qPCR; 
Methods). Q20* led to variable nonsense-mediated RNA decay and 
blocked expression of the 5-HT2B receptor protein (Fig. 2 and 
Methods). HTR2B is widely expressed in the adult human brain, 
and the frontal lobe is one of the regions where it is most highly 
expressed (Methods and Supplementary Fig. 13). 

HTR2B Q20* is apparently exclusive to Finns. In >3,100 indivi- 
duals representative of worldwide diversity, including the Human 
Genome Diversity Panel (Supplementary Table 8), one additional 
Q20* carrier was observed: a female with a Finnish surname and with 
alcoholism. Indicative of a common origin and founder population 
effect, Q20* was found on a single haplotype background (Sup- 
plementary Fig. 9), and in Finns who were likely to be non-admixed 
(Supplementary Fig. 2). Genetic subisolates have been identified 
within Finland, including families in Eastern Finland. Also, the 
Finnish population apparently was founded by two waves of migra- 
tion: Eastern Uralic founders arrived 4,000 years ago, followed by 
Indo-European speakers 2,000 years later’*. However, it is unlikely 
that the Q20* association is an occult admixture artefact because 
Q20* carriers are common across Finland (in Middle, Western and 
Eastern regions) (Supplementary Fig. 3), and cases and controls did 
not differ in Finnish ancestry (Supplementary Fig. 4 and Methods). 

In the 17 violent offenders (from the case-control study) who carried 
Q20*, impulsivity had a strong role in their crimes. Although convicted 
for a variety of offences including homicide, attempted homicide, 
arson, battery and assault, 94% of their crimes were committed under 
the influence of alcohol. The crimes of the Q20* carrier probands 
occurred as disproportionate reactions to minor irritations and were 
unpremeditated, without potential for financial gain and recurrent. 
From court records up to an average age of 43, Q20* carriers had 
committed an average of 5 violent crimes (range 2-12). The Q20* cases 
tended to fulfil the criteria for ASPD (82%) and IED (57% meeting 3 
out of 4 IED criteria), except that alcoholism, ASPD and BPD are 
exclusionary for IED. Overall, Q20* carriers were cognitively normal 
(mean IQ, 98; s.d., 14.9; range 75-124; two with IQ <87, Wechsler 
Adult Intelligence Scale). 

Figure 1 | HTR2B Q20* co-segregates with impulsivity. Co-segregation of HTR2B Q20* with ASPD and alcohol use disorder (AUD) in eight informative families. 


ATG GCT CTC TCT TAC AGA GTG TCT GAA CTT CAA AGC ACA ATT CCT GAG CAC ATT TTG [C/T]AG 
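
The cDNA sequence above can be translated to show why the C/T change at codon 20 is a stop variant. The following is a minimal sketch using the standard genetic code; the helper names are hypothetical, and translation simply halts at the first stop codon.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(cdna.split()).upper()
    protein = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 19 codons of the HTR2B coding sequence as printed in the text.
PREFIX = ("ATG GCT CTC TCT TAC AGA GTG TCT GAA CTT CAA AGC ACA "
          "ATT CCT GAG CAC ATT TTG")

wild_type = translate(PREFIX + " CAG")  # codon 20 = CAG (glutamine, Q)
variant = translate(PREFIX + " TAG")    # codon 20 = TAG (stop)
```

With the C allele the twentieth residue is glutamine; with the T allele translation ends after residue 19, which is the Q20* change described in the text.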


In temperament, as measured by the Tridimensional Personality 
Questionnaire, Q20* carriers, like others with ASPD, score more 
highly in ‘novelty seeking’ and ‘harm avoidance’, but are otherwise 
more socially attached, empathic and dependent than the other violent 
offenders within the study group (Supplementary Data). Extrapolating 
from the Q20* frequency of 0.012 (and with 174 Q20* carriers directly 
genotyped), 53,000 Finnish males (and as many females) are heterozygous. 
However, although few Q20* carriers are criminals, violent 
criminals with Q20* seem to represent some of the most impulsive 
individuals within our violent offender cohort. Among 100-155 homicides 
annually in the Finnish population of 5.3 million, there are few 
instances of multiple homicide. In our sample, only three individuals 
were convicted of multiple homicide, and all three carried the Q20* 
allele. 

Figure 2 | HTR2B Q20* blocks protein expression. a, b, cDNA (a) and protein locations (b) of HTR2B Q20*. b, Labels I, II, III, IV, V, VI and VII refer to the seven transmembrane domains of the 5-HT2B protein and ref. 31 indicates the position in the 5-HT2B protein of a known, previously identified, HTR2B stop codon. c, Variable stop-codon-mediated RNA decay determined by cDNA sequencing of 12 Q20* heterozygotes. d, Q20*-mediated blockade of 5-HT2B protein expression in western blots (validated with three anti-5-HT2B antibodies; described in Methods). The 5-HT2B protein ratio was 1.93:1 in 14 Q20/Q20 homozygotes (mean, 1.78; s.d., 2.24) compared to 14 Q20/Q20* heterozygotes (mean, 0.92; s.d., 1.14) (P = 0.03) (Methods). 

In our sample, the influence of Q20* was not due to interaction with 
MAOA or serotonin transporter genotypes (data not shown). However, 
it was not possible to rule out other gene interactions, or a modifying 
role of stress. Cerebrospinal fluid monoamine metabolite levels, another 
potential confounding factor, did not differ in Q20* carriers (Sup- 
plementary Data). Therefore, it is unlikely that their impulsivity was 
due to low turnover of serotonin, dopamine or norepinephrine or that 
Q20* substantially affects monoamine metabolism, as does the MAOA 
stop codon’. 

Risk conferred by Q20* seems to be modulated by sex and alcohol. 
Worldwide, suicide accounts for 1.5% of deaths, and Finland has a very 
high suicide rate, especially among men”. In our study, 70% of the 
Q20* male cases showed impulsive suicidal behaviour (for example, 
slashings, hanging attempts, drug overdoses) usually while intoxicated, 
for an average of 3.2 suicide attempts. At age 33.5 (s.d. ± 11), 66% had 
at least one life-threatening suicide attempt. It is unknown if suicide 
risk conferred by Q20* extends to the general population, whose mem- 
bers are at lower risk. Males are more likely to commit suicide” and to 
have ASPD and aggression, with a tenfold higher preponderance for 
the early-onset life-course-persistent variant of ASPD*. Moreover, 
alcohol-related violence is known to be higher among males, and the 
serotonin system is thought to contribute to individual differences in 
alcohol-facilitated impulsive aggression™. 

In the violent offender cohort, Q20* carriers were cognitively normal 
and in almost every instance acted out on their impulsivity only when 
inebriated. Having found the association of Q20* with impulsivity in a 
phenotypically extreme sample, it was important to define Q20* fre- 
quency and relationship to behaviour in the wider population, even 
though the only possible follow-up was in Finland. In >6,000 Finns 
ascertained epidemiologically (rather than from the criminal popu- 
lation), the Q20* allele frequency was 0.012 (the same as the frequency 
in controls) (Supplementary Table 9). We identified one Q20* homo- 
zygote, a young male adult with no major medical illness but with a 
history of violent behaviour while under the influence of alcohol 
(Supplementary Methods). 
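
The observation of a single homozygote can be put in context with a Hardy-Weinberg calculation. The sketch below assumes random mating and a cohort of exactly 6,000 individuals (the text only says ">6,000"); the function names are hypothetical.

```python
def hardy_weinberg(q: float) -> dict:
    """Expected genotype frequencies for a biallelic locus with minor allele frequency q."""
    p = 1.0 - q
    return {"wild_type": p * p, "heterozygote": 2 * p * q, "homozygote": q * q}


def expected_counts(q: float, n_individuals: int) -> dict:
    """Expected genotype counts in a sample of n_individuals."""
    return {genotype: freq * n_individuals
            for genotype, freq in hardy_weinberg(q).items()}


# Q20* allele frequency from the text; cohort size is an assumed round number.
counts = expected_counts(0.012, 6000)
```

Under these assumptions about 0.9 homozygotes and about 142 heterozygotes would be expected, so finding one Q20* homozygote is in line with the reported allele frequency.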

We followed up the cognitive effects of Q20* in 933 individuals in the 
FinnTwin12 and FinnTwin16 studies (22 with the stop codon) (Supplementary 
Methods). Overall, Q20* carriers were again cognitively 
normal. However, male (but not female) Q20* carriers had significantly 
lower Digit Span Forward (P = 0.002) and Backward (P < 0.001) 
scores, possibly indicating selective impairment in working memory 
(Supplementary Fig. 12), a specific measure of frontal cortical function. 


Htr2b−/− mice 

Although severe developmental consequences have been observed in 
Htr2b knockout mice, approximately 50% of the mice that survive the 
first postnatal week are apparently normal as adults*. These mice 
were reported to be impulsive in an open field novelty test”. We 
assessed Htr2b knockout mice for five separate measures of impulsivity 
and novelty seeking: delay discounting, activity in a novel environ- 
ment, exposure to a novel object, motor activity after a dopamine D1 
receptor agonist, and decreased latency to eat in the novelty suppressed 
feeding test (hyponeophagia). The Htr2b−/− mice were more impulsive 
and more responsive to novelty in all of these tests (Fig. 3). In rats, both 
impulsivity and response to novelty are predictors for the development 
of addiction-like behaviours. In addition to their differences in behaviour, 
Htr2b−/− males had a threefold elevation in plasma testosterone 
(Fig. 3 and Supplementary Methods). Testosterone (measured in the 
cerebrospinal fluid of nine heterozygous violent offenders) also seemed 
to be higher in human males carrying Q20* (Supplementary Fig. 11). 
This raises the possibility of an interaction between Q20* and testoster- 
one contributing to impulsive behaviours, as was reported between 
MAOA and testosterone in the same population of Finns that we 
studied here’. 


Discussion 


The aim of this study was to identify genetic variation associated with 
impulsivity, an intermediate phenotype thought to contribute to several 
psychiatric disorders including addictions’’. The goal is to track shared 
genetic factors in these diseases and to contribute to their reconceptua- 
lization on a neurobiological basis. Another purpose of identifying 
genes influencing impulsivity is to determine which of the potential 
aetiologies and types of impulsivity, for example novelty seeking versus 
executive dysfunction*’, are important in human populations. The dis- 
covery of genes influencing impulsive behaviour would validate the idea 
that it is possible to deconstruct the multi-process origins of impulsive 
behaviour. 

HTR2B Q20* is associated and co-segregates with disorders characterized 
by impulsivity, reflected in severe crimes committed on the 
spur of the moment (as documented by criminal and clinical 
records) and under alcohol intoxication, a condition where impulse 
control is impaired. Thus, the Q20* allele can be regarded as one 
determinant of behavioural variation. However, the presence of 
Q20* was not in itself sufficient: male sex, testosterone level, the 
decision to drink alcohol, and probably other factors such as stress 
exposure, all have important roles. Although relatively common in 
Finland, HTR2B Q20* is unlikely to explain a large fraction of the 
overall variance in impulsive behaviours. There are likely to be many 
pathways to impulsivity in its various manifestations, and the genetic 
association may be present only in the most phenotypically extreme. 

It is unsurprising that a stop codon variant discovered by sequen- 
cing within a founder population is common only in it, and even 
restricted to it. However, this observation is also in line with the 
significance of Q20* as a complete loss of function variant, and with 
the behavioural consequences in some heterozygous carriers. The 
relatively high frequency of Q20* in Finns would thus reflect its status 
as a founder mutation, in contrast with MAOA, COMT and SLC6A4 
(previously known as HTT) alleles that are common worldwide, more 
modestly affect molecular function, and may have counterbalancing 
selective advantages. However, it is highly unlikely that Finns are 
unique in possessing a severe genetic variation leading to impulsivity. 
There is the previous example of the MAOA stop codon found in one 
Dutch family. On average, ten or more heterozygous stop codons 
reside in the genomes of each individual of European ancestry”, 
but perhaps because the source populations from which the probands 
were sequenced did not have founder characteristics, no common 
stop codon had yet been reported for a neurotransmitter gene. 
Although rare variants identified in founder populations are more 
likely to be confined to those populations, analyses of the relationship 
between gene variation and phenotype can be conducted within the 
founder population, identifying new candidate genes and pathways 
influencing behaviour or other aetiologically complex phenotypes. 

As has often been illustrated, the availability of mouse genetic models, 
including gene knockouts, offers an opportunity to test the predictive 
validity of genetic discoveries and to define effects in contexts where 
genetic background and environment are better controlled. The Htr2b 
mouse knockout reveals more general effects of 5-HT2B deficiency on 
behaviour, including effects on novelty seeking. This could be explained 
by pleiotropic actions of the serotonin 2B receptor. On the other hand, 
the effect of the Htr2b knockout on delay discounting seems to validate 
the effect of the Q20* stop codon on impulsivity in people. In people, we 
observed a significant association between the HTR2B Q20* variant and 
impairment in working memory, a neurocognitive process contributing 
to or predictive of executive cognitive function. The ability to store and 
integrate knowledge about possible choices with the current context 
enables the individual to select appropriate cognitive strategies and 
generate optimal reactions. This is consistent with the impulsivity 
observed in HTR2B Q20* cases, who seemed deficient in the ability to 
weigh the consequences of their acts. 

The use of deep sequencing to detect a stop codon associated with 
impulsivity in a founder population reveals a role for the HTR2B gene 
in behaviour. It also indicates that this approach may be applicable to 
other complex behavioural traits, including those diseases for which 
impulsivity is itself an intermediate phenotype. 

Figure 3 | Increased impulsivity and novelty seeking in Htr2b−/− mice. a, b, Increased locomotor response of Htr2b−/− mice to environmental novelty (a) and to a dopamine D1 receptor agonist (SKF 81297) (b). WT, wild type. c, Increased number of contacts of Htr2b−/− mice with a novel object. d, Increased delay discounting of Htr2b−/− mice. LL, large and late hole, nose pokes leading to delivery of a larger but later reward. e, Reduced hyponeophagia in 18-h starved Htr2b−/− mice. f, Male Htr2b−/− mice have threefold higher plasma testosterone levels as compared to control mice. *P < 0.05, **P < 0.01, ***P < 0.001. Error bars are data ± standard error. 


METHODS SUMMARY 


Fourteen serotonergic and dopaminergic genes were resequenced (Solexa GA2) 
in 96 Finnish Caucasian male violent offenders and 96 matched controls free of 
psychiatric diagnoses. Exon-centric sequencing was performed by amplifying 108 
regions, for a total of 82 kb, in pools of 12 subjects. HTR2B Q20* was genotyped in 
a Finnish sample of 228 cases and 295 controls, in 89 Finnish families, and in 
5,684 individuals belonging to either a Finnish family data set (N = 1,885), the 
Older Finnish Twin cohort (N = 2,388) or the FinnTwin16 and FinnTwin12 
studies (N = 1,411), as described in detail in Supplementary Methods, and in 
>3,100 samples representing worldwide diversity. Genotyping was performed 
with a custom 5′ exonuclease assay (Applied Biosystems 7900) using these primers 
and probes: forward primer, 5′-AGAGTGTCTGAACTTCAAAGCACAA-3′; 
reverse primer, 5′-TCCAGACCAGTTAGAAGAGATAACGT-3′; probe 1, 
5′-AGGTGCTCTGCAAAAT-3′; probe 2, 5′-AGGTGCTCTACAAAAT-3′. 
One hundred and eighty-six ancestry informative markers were genotyped on 
1536-SNP arrays (Illumina). HTR2B expression in 13 human brain 
regions was determined by qPCR using ABI TaqMan gene expression assays (Hs01118766 
and Hs00168362). β-actin was the internal control. Total protein and total RNA 
were extracted from lymphoblastoid cell lines using the TRIzol LS reagent protocol 
(Invitrogen). Nonsense-mediated RNA decay was detected by sequencing 
complementary DNA from HTR2B Q20/Q20* heterozygotes on a 
3700ABI capillary sequencer. HT2B protein was measured in 12 Finnish Q20/Q20 homozygotes 
and 14 Finnish Q20/Q20* heterozygotes. Blots were probed with antisera raised 
against the amino-terminal (mouse monoclonal antibody; Novus Biologicals), 
internal (goat polyclonal antibody; Santa Cruz Biotechnology), or carboxy-terminal 
(rabbit polyclonal antibody; Santa Cruz Biotechnology) regions of the 
HT2B receptor, and GAPDH antibody (Millipore). Densitometry was performed 
using National Institutes of Health (NIH) ImageJ. Htr2b−/− knockout mice were 
made in a pure 129Sv/PAS background and compared to 129/SvPAS control 
mice (8-10 weeks old) for four measures of response to novelty and for delay 
discounting. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Human studies. Written informed consent was obtained from each participant. 
Protocols were approved by the Institutional Review Board (IRB) of the NIH and 
the National Institute of Mental Health (NIMH), by the Office for Protection 
from Research Risks (OPRR), Indiana University IRB, by the University of 
Helsinki Department of Psychiatry IRB, by the University of Helsinki Central 
Hospital IRB, the University of Turku Central Hospital IRB, and by the Ministry 
of Social Affairs and Health and the Ethics Committee of the National Public 
Health Institute of Finland. 

Animal studies. Mice were housed under controlled environmental conditions. 
Behavioural tests and animal care were conducted in accordance with standard 
ethical guidelines (NIH’s “Guide for the Care and Use of Laboratory Animals”, 
and the European Communities Council Directive 86/609/EEC). 
All experiments involving mice were approved by the Ile de France 
Regional Ethics Committee for Animal Experiments. 

Finnish violent offenders’ cohort and controls. Cases were 228 unrelated Finnish 
male violent offenders and arsonists (Supplementary Table 1) who, because of the 
extreme nature of their crimes, underwent forensic psychiatric examination at the 
time of their initial incarceration. They were studied as inpatients at the University 
of Helsinki. These subjects were diagnosed with the Structured Clinical 
Interview for DSM (SCID) according to DSM-III-R criteria for ASPD, BPD and 
IED. Excluded were subjects with schizophrenia or a history of psychosis. Ninety- 
six cases were selected for resequencing from the larger Finnish case cohort, 
comprising 228 individuals with diagnoses of ASPD, BPD and IED, on the basis 
that they had the highest Brown-Goodwin Lifetime Aggression (BGLAS) scores, 
with scores of 23.7 (s.d. ± 4.9) out of a theoretical maximum of 36. Controls 
(N = 295) were unrelated, nonimpulsive Finnish volunteers recruited by adver- 
tisements in local newspapers, paid for their participation and psychiatrically 
interviewed by trained psychiatrists. Cases and controls were independently 
blind-rated from interview data by two research psychiatrists under the super- 
vision of a senior research psychiatrist. Inter-rater reliability was high, and differ- 
ences were resolved by the senior psychiatrist. Controls were free of ASPD, BPD, 
IED, psychosis or schizophrenia but some had mood or anxiety disorders or 
alcohol use disorder (Supplementary Table 1). Ninety-six male controls free of 
Axis I and II diagnoses and matched for age were selected for sequencing for SNP 
discovery from a cohort of 295 controls. Controls had a BGLAS score of 8.1 
(s.d. ± 4.9). 

A total of 89 pedigrees were collected. Family members were interviewed using 
the SCID and diagnosed using DSM-III-R criteria. DNA and data were available 
for 397 subjects in families. Genomic DNA was prepared from lymphoblastoid 
cell lines. 

Resequencing. For the exon-centric targeting of 14 candidate genes, we custom-designed 
or used Applied Biosystems oligonucleotide primers to amplify 108 
target regions that covered exons, flanking regions and ~800-1,000 bp of the 
upstream regions of 14 genes, for a total of 82 kb (Supplementary Table 2). 

DNA samples were individually quantified in three replicates by RT-PCR, 
using TaqMan RNase P Detection Reagent kits (FAM) and Roche human 
DNA standards, and were normalized to 10 ng µl⁻¹. Eight DNA pools (12 subjects 
per pool) were made with equal amounts of DNA from 96 Finnish cases and 
in parallel fashion eight pools were made from 96 Finnish controls. Average 
sequencing coverage per individual per nucleotide was 80X. 

For DNA amplification, DNA pools were amplified in 108 separate PCR reactions 
(Supplementary Methods). 

Before DNA sequencing, amplicon concentrations were normalized using 
SequalPrep Normalization Plate kits (Invitrogen). All amplicons from the same 
DNA pool were combined. The DNA was sheared by sonication and purified with 
QIAquick PCR purification kits (QIAGEN). Genomic DNA preparation kits and 
protocol (Illumina) were used to prepare sequencing libraries. 

Analysis of sequence data was carried out by calling sequences from image files 
with the Illumina Genome Analyser Pipeline and aligning them to human ref- 
erence sequences from NCBI build 36.3 using the Illumina Eland software. Each 
36-base read was uniquely mapped to the human reference genome. Sequence 
reads with more than two mismatches were excluded. Sequence reads carrying 
alternative alleles, which therefore did not exactly match the reference genome, still mapped uniquely 
to the corresponding location in the reference sequence. Additional results 
are described in Supplementary Data. 

Capillary electrophoresis sequencing. nsSNPs were validated by Sanger sequen- 
cing using the BigDye Terminator Sequencing Mix (Applied Biosystems) and 
analysed on the Applied Biosystems 3730 DNA Analyser. Of 26 nsSNPs, 22 were 
validated, and overall 30/34 SNPs tested in this way were validated. 

Predicted functionality. Missense, nonsense and synonymous variants were 
assessed for predicted effects on protein function (probably damaging or damaging) using the 
PolyPhen and SIFT amino acid substitution prediction methods. Four variants 
(DRD1 S259Y, HTR2B R388W, HTR2B Q20* and TPH2 P206S, rs17110563) 
scored as damaging or intolerant by both methods were used in a global test of 
proportion of rare functional variants in cases (ASPD, BPD or IED) and controls. 
Genotypes of the four SNPs were collapsed so that an individual was coded as 1 if 
a rare allele was present and otherwise as 0. Frequencies of putatively functional 
variants were globally compared between cases and controls, with the null hypo- 
thesis being a lack of difference between cases and controls in the proportion 
carrying the putatively functional variants. A case-control association test was 
also performed for HTR2B Q20* alone. The Pearson χ² test was used to test the null 
hypothesis. All analyses were conducted using JMP software v7.0 (SAS Institute). 
The criterion for statistical significance was set at 0.05. 
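
The collapsing step and the Pearson test described above can be sketched as follows. The genotype data are entirely hypothetical (a handful of made-up individuals, four variants each), and the code is an illustration of the general technique in plain Python, not a reproduction of the JMP analysis. For one degree of freedom the χ² tail probability equals erfc(√(χ²/2)).

```python
from math import erfc, sqrt


def collapse(genotypes):
    """Code each individual as 1 if any rare allele is present, otherwise 0."""
    return [1 if any(count > 0 for count in person) else 0 for person in genotypes]


def pearson_chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) and P value for a 2x2 table.

    a, b = carriers and non-carriers among cases;
    c, d = carriers and non-carriers among controls.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function, 1 d.f.
    return chi2, p_value


# Hypothetical rare-allele counts per person at four variants.
cases = [(0, 0, 0, 1), (0, 0, 0, 0), (1, 0, 0, 0),
         (0, 0, 0, 1), (0, 0, 0, 0), (0, 0, 0, 0)]
controls = [(0, 0, 0, 0)] * 5 + [(0, 1, 0, 0)]

case_flags = collapse(cases)
control_flags = collapse(controls)
chi2, p_value = pearson_chi2_2x2(
    sum(case_flags), len(case_flags) - sum(case_flags),
    sum(control_flags), len(control_flags) - sum(control_flags),
)
```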

Genotyping. HTR2B Q20* was genotyped in 228 Finnish cases and 295 Finnish 
controls and in 89 pedigrees belonging to the Finnish cohort for a total of 352 
subjects. Taking into account the fact that some families had affected probands, 
we genotyped a total of 872 Finnish DNAs. In addition to the Finnish case/control 
and family data set and over 3,100 samples representing worldwide diversity, we 
also genotyped a total of 5,684 individuals belonging to either a Finnish family 
data set (N = 1,885), or to the Older Finnish Twin cohort (N = 2,388) and the 
FinnTwin16 and FinnTwin12 studies (N = 1,411), as described in Supplementary 
Methods. 

Genotyping of Q20* was performed with a custom 5′ exonuclease assay 
(Applied Biosystems 7900) using these primers and probes: forward primer, 
5′-AGAGTGTCTGAACTTCAAAGCACAA-3′; reverse primer, 
5′-TCCAGACCAGTTAGAAGAGATAACGT-3′; probe 1, 5′-AGGTGCTCTGCAAAAT-3′; 
probe 2, 5′-AGGTGCTCTACAAAAT-3′. 

Ancestry informative markers. A panel of 186 ancestry informative markers 
was genotyped on 1536-SNP arrays (Illumina). No difference was detected 
between cases (ASPD, BPD and IED) and controls in proportions of ancestries. 
The pattern of measured ancestry for seven ancestry factors derived separately for 
each subject was compared between controls (N = 279) and cases (N = 220) with 
reference to the Human Genome Diversity Panel (HGDP) (1,051 DNAs repre- 
senting 51 populations worldwide). 

Finnish ancestry was measured using 177 ancestry informative markers in 29 
Q20* carriers, 580 other Finns, and 200 individuals representing 10 European 
populations in HGDP. Principal component analysis was performed with 
EIGENSTRAT. 

For HTR2B RNA and protein expression studies, total protein and RNA were 
extracted from lymphoblastoid cell lines using the TRIzol LS reagent protocol 
(Invitrogen). 

HTR2B cDNA sequencing for nonsense-mediated decay. Nonsense-mediated 
RNA decay was detected by sequencing cDNA from HTR2B Q20/Q20* hetero- 
zygotes on a 3700ABI capillary sequencer (Fig. 2 and Supplementary Methods). 
The sequences of the upstream and downstream oligonucleotides were as follows: 
5′-gagtgtttgccatgcttaca-3′ and 3′-accaggcaggacatagaaca-5′ (Supplementary Methods). 
HTR2B Q20 and Q20* transcripts were quantified by comparing the relative 
intensities of the Q20 and Q20* sequencing peaks within each heterozygous indi- 
vidual (Supplementary Methods). 
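
One simple way to express the peak comparison described above is the share of the total signal at the variant position that comes from the Q20* peak. The sketch below assumes that peak heights scale with transcript abundance; the function name and the peak values are hypothetical, and the exact quantification used by the authors is given only in Supplementary Methods.

```python
def mutant_transcript_fraction(q20_peak: float, q20_stop_peak: float) -> float:
    """Fraction of signal at the variant position contributed by the Q20* transcript."""
    total = q20_peak + q20_stop_peak
    if total <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return q20_stop_peak / total


# Hypothetical peak heights (Q20, Q20*) for three heterozygotes.
peaks = [(1000.0, 1000.0), (1500.0, 500.0), (900.0, 100.0)]
fractions = [mutant_transcript_fraction(wt, mut) for wt, mut in peaks]
```

A fraction near 0.5 would indicate little decay of the Q20* transcript, whereas lower values would indicate that the nonsense transcript is depleted; variation between individuals corresponds to the "variable" decay described in the text.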

Western blots. HT2B protein was measured in 12 Finnish Q20/Q20 homozy- 
gotes and 14 Finnish Q20/Q20* heterozygotes. Western blots were prepared 
using 50 µg of protein per lane on a 10% Bis-Tris gel (Invitrogen). Separated 
proteins were transferred to nitrocellulose using the iBlot transfer system 
(NuPage; Invitrogen). Blots were probed with antisera raised against the 
amino-terminal (mouse monoclonal antibody; Novus Biologicals), internal (goat 
polyclonal antibody; Santa Cruz Biotechnology) or carboxy-terminal (rabbit 
polyclonal antibody; Santa Cruz Biotechnology) regions of the HT2B receptor, 
and GAPDH antibody (Millipore). 

Antibody binding was visualized on X-ray film (Kodak XAR) using chemi- 
luminescence (ECL Plus, GE Healthcare). Densitometry was performed using 
NIH ImageJ. Ratios between the 5-HT2B receptor and the housekeeping protein 
GAPDH were calculated to normalize 5-HT2B protein quantity. 
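
The normalization described above amounts to dividing each 5-HT2B band density by the GAPDH density from the same lane and then comparing group means. The sketch below uses hypothetical densitometry readings and function names; the last line only rechecks the 1.93:1 ratio from the group means quoted in the Figure 2 legend (1.78 and 0.92).

```python
from statistics import mean


def normalized_signal(target_density: float, gapdh_density: float) -> float:
    """5-HT2B band density divided by the GAPDH density of the same lane."""
    if gapdh_density <= 0:
        raise ValueError("GAPDH density must be positive")
    return target_density / gapdh_density


def group_ratio(group_a, group_b) -> float:
    """Ratio of mean normalized signal between two genotype groups."""
    return mean(group_a) / mean(group_b)


# Hypothetical (5-HT2B, GAPDH) densities per lane.
homozygote_lanes = [(1200.0, 800.0), (900.0, 600.0)]
heterozygote_lanes = [(500.0, 1000.0), (700.0, 700.0)]

homozygotes = [normalized_signal(t, g) for t, g in homozygote_lanes]
heterozygotes = [normalized_signal(t, g) for t, g in heterozygote_lanes]
ratio = group_ratio(homozygotes, heterozygotes)

# Ratio implied by the group means reported in the Figure 2 legend.
reported_ratio = round(1.78 / 0.92, 2)
```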

qPCR for HTR2B in human brain. HTR2B expression in 13 human 
brain regions was determined by qPCR using ABI TaqMan gene expression assays 
(Hs01118766 and Hs00168362). β-actin was the internal control. 

Neuropsychological assessment. Neuropsychological assessment was conducted 
on the combined FinnTwin16 and FinnTwin12 cohorts (described in 
Supplementary Methods) for measures of verbal intellectual ability, working 
memory and executive function. Working memory was assessed with the Digit 
Span Forward and Backward subtests of the Wechsler Memory Scale-Revised 
(WMS-R). We analysed the combined FinnTwin16 and FinnTwin12 data sets. A 
linear regression model was constructed using performance on the working 
memory test as the dependent variable and sex and genotype as independent 
variables. Sex was a significant predictor, so the sample was stratified into male 
and female. Male heterozygotes performed significantly worse on the Digit Span 
Backward and Forward tests, and combined score (Supplementary Table 12 and 
Supplementary Fig. 12). All statistical analyses were conducted using Stata (ver- 
sion 11, Stata Corp, College Station, Texas, USA). The criterion for statistical 
significance was set at 0.05. Bonferroni correction for multiple testing was 
applied, as presented in Supplementary Table 12. 

Htr2b knockout mice. Htr2b−/− knockout mice (50% males and 50% females) 
were made in a pure 129Sv/PAS background. Wild-type 129/SvPAS mice (8-10 
weeks old), bred in-house, were used as controls. 

Novelty seeking and impulsive behaviour in Htr2b−/− knockout mice were 
investigated using five experimental measures: novelty-induced locomotion; 
locomotor reactivity in response to a dopamine D1 receptor agonist; exposure 
to a novel object; delay discounting; and novelty-suppressed feeding. Plasma 
testosterone levels were measured. 

Epigenetic proteins are intently pursued targets in ligand discovery. So far, successful efforts have been limited to 
chromatin modifying enzymes, or so-called epigenetic ‘writers’ and ‘erasers’. Potent inhibitors of histone binding 
modules have not yet been described. Here we report a cell-permeable small molecule (JQ1) that binds competitively 
to acetyl-lysine recognition motifs, or bromodomains. High potency and specificity towards a subset of human 
bromodomains is explained by co-crystal structures with bromodomain and extra-terminal (BET) family member 
BRD4, revealing excellent shape complementarity with the acetyl-lysine binding cavity. Recurrent translocation of 
BRD4 is observed in a genetically-defined, incurable subtype of human squamous carcinoma. Competitive binding by 
JQ1 displaces the BRD4 fusion oncoprotein from chromatin, prompting squamous differentiation and specific 
antiproliferative effects in BRD4-dependent cell lines and patient-derived xenograft models. These data establish 
proof-of-concept for targeting protein-protein interactions of epigenetic ‘readers’, and provide a versatile chemical 
scaffold for the development of chemical probes more broadly throughout the bromodomain family. 


Gene regulation is fundamentally governed by reversible, non-covalent 
assembly of macromolecules’. Signal transduction to RNA polymerase 
requires higher-ordered protein complexes, spatially regulated by 
assembly factors capable of interpreting the post-translational modifica- 
tion states of chromatin’. Readers of epigenetic marks are structurally 
diverse proteins each possessing one or more evolutionarily conserved 
effector modules, which recognize covalent modifications of histone 
proteins or DNA. The ε-N-acetylation of lysine residues (Kac) on 
histone tails is associated with an open chromatin architecture and 
transcriptional activation®. Context-specific molecular recognition of 
acetyl-lysine is principally mediated by bromodomains. 

Bromodomain-containing proteins are of substantial biological 
interest, as components of transcription factor complexes and deter- 
minants of epigenetic memory*. There are 41 diverse human proteins 
containing a total of 57 bromodomains. Despite large sequence var- 
iations, all bromodomain modules share a conserved fold comprising 
a left-handed bundle of four α helices (αZ, αA, αB, αC), linked by 
diverse loop regions (ZA and BC loops) that contribute to substrate 
specificity. Co-crystal structures with peptidic substrates showed that 
the acetyl-lysine is recognized by a central hydrophobic cavity and is 
anchored by a hydrogen bond with an asparagine residue present in 
most bromodomains®. The BET family (BRD2, BRD3, BRD4 and 
BRDT) shares a common domain architecture featuring two 
amino-terminal bromodomains that exhibit high levels of sequence 
conservation, and a more divergent carboxy-terminal recruitment 
domain (Supplementary Fig. 1)°. 

Recent research has established a compelling rationale for targeting 
BRD4 in cancer. BRD4 remains bound to transcriptional start sites of 
genes expressed during the M/G1 transition, influencing mitotic progression*.
BRD4 is also a critical mediator of transcriptional elongation,
functioning to recruit the positive transcription elongation factor
complex (P-TEFb)’*. Cyclin-dependent kinase-9, a core component 
of P-TEFb’, is a validated target in chronic lymphocytic leukaemia’, 
and has recently been linked to c-Myc-dependent transcription’. 
Thus, BRD4 recruits P-TEFb to mitotic chromosomes resulting in 
increased expression of growth-promoting genes”. 

Importantly, BRD4 has recently been identified as a component of 
a recurrent t(15;19) chromosomal translocation in an aggressive form 
of human squamous carcinoma‘*”*. Such translocations express the 
tandem N-terminal bromodomains of BRD4 as an in-frame chimaera 
with the NUT (nuclear protein in testis) protein, genetically defining 
the so-called NUT midline carcinoma (NMC). Functional studies in 
patient-derived NMC cell lines have validated the essential role of the 
BRD4-NUT oncoprotein in maintaining the characteristic prolifera- 
tion advantage and differentiation block of this uniformly fatal malig- 
nancy’’. Notably, RNA silencing of BRD4-NUT arrests proliferation 
and prompts terminal squamous differentiation. These observations 
underscore the broad utility and immediate therapeutic potential of a 
direct-acting inhibitor of human bromodomain proteins. 


A selective and potent inhibitor of BET family 
bromodomains 


A major collaborative focus of our research groups concerns the develop- 
ment of chemical probes’*”’ and the optimization of therapeutic leads 
for the translation of small-molecule modulators of epigenetic targets 
as cancer therapeutics. Motivated by the above rationale, we have 
developed biochemical platforms for the identification of new inhibitors 
of bromodomain isoforms using high-throughput screening, as well as 
the annotation of putative ligands emerging from collaborative and 
published research. In the course of these studies, we learned of an 
observation by Mitsubishi Pharmaceuticals that simple thienodiazepines
possessed binding activity for BRD4 (ref. 20). Previous research from this 
group indicates that these compounds emerged from anti-inflammatory 
phenotypic studies, such as inhibition of CD28 co-stimulation as a 
means of treating autoimmune diseases”!”’. A rich literature has estab- 
lished the synthetic accessibility and favourable pharmacological prop- 
erties of this privileged class of drug-like small molecules”. Indeed, the 
core scaffold described appears in FDA-approved substances such as 
alprazolam and triazolam. 

Inferring structure-activity relationships also derived from molecular 
modelling of candidate ligands within the binding pocket of the apo 
crystal structure of the first bromodomain of BRD4 (hereafter referred 
to as BRD4(1); Protein Data Bank code 2OSS), we designed a prototype 
ligand, JQ1 (Fig. 1a). JQ1 is a novel thieno-triazolo-1,4-diazepine,
possessing an appended, bulky t-butyl ester functional group at C6 in
order to (1) allow for additional pendant group diversity, as needed, 
and (2) to mitigate binding to the central benzodiazepine receptor as 
predicted by published structure-activity relationships”. We first estab- 
lished a high-yielding, seven-step synthetic route to access racemic JQ1 
(hereafter referred to as JQ1) and derivatives (scheme 1 in Supplementary
Methods). We have also identified a route to synthesize each enantiomer,
(+)-JQ1 and (−)-JQ1 (scheme 2 in Supplementary Methods).

To establish a biochemical platform for comprehensive selectivity 
screening, all human bromodomains were subcloned into bacterial 
expression vectors. Testing of an average of 15 expression constructs 
per bromodomain resulted in the identification of 37 expression systems 
that yielded soluble protein suitable for specificity screening and covered 
all bromodomain subfamilies (Supplementary Table 1). Because the 
specific substrates of most bromodomains are unknown, a general bind- 
ing assay based on differential scanning fluorimetry (DSF) was imple- 
mented”. Binding of (+)-JQ1 significantly increased the thermal 
stability of all bromodomains of the BET family (Fig. 1b and Supplementary
Table 2) with ΔTm^obs values between 4.2 °C (BRDT(1))
and 10.1°C (BRD4(1)). No significant stability shifts were detected 
for bromodomains outside the BET family, indicating that this ligand 
is highly selective. In contrast, the stereoisomer (−)-JQ1 showed no
significant interaction with any bromodomain present in our panel. 

Within a family of proteins a linear correlation between DSF ΔTm^obs
values and binding constants has been observed, with temperature 
shifts larger than 6 °C corresponding to compounds with nanomolar 

Figure 1 | Structure and selectivity of JQ1. a, Structure of the two JQ1
stereoisomers. The stereocentre at C6 is indicated by an asterisk. b, Assessment 
of inhibitor selectivity using differential scanning fluorimetry (DSF). Shown are 
averaged temperature shifts (ΔTm^obs) in degrees Celsius upon binding of 10 µM
(+)-JQ1. The temperature shifts are represented by spheres as indicated in the 
inset. Screened bromodomains are labelled and proteins not included in the 
selectivity panel are shown in grey. (−)-JQ1 did not reveal any significant
temperature shifts to any of the screened bromodomains. c, Isothermal titration 
calorimetry (ITC). Differential power (AP) data time course of raw injection 
heats are shown for a blank titration of BRD4(1) into buffer (A), and reverse 
titrations using the inactive isomer (−)-JQ1 (B) and the active isomer (+)-JQ1
(C). The inset shows normalized binding enthalpies corrected for the heat of 
dilution as a function of binding site saturation (symbols as indicated in the
inset). Solid lines represent a nonlinear least squares fit using a single-site 
binding model. d, Thermal shifts (ΔTm^obs) show good correlation to
dissociation constants (Kd) determined by ITC for the BET bromodomains.
The dotted red line represents a least squares fit with an R of 96%. The ΔTm^obs
data represent the mean ± s.d. (n = 3). Error for ITC data was based on
deviations to least squares fit described in c. e, Competitive displacement of a 
histone peptide from human bromodomains is exhibited by JQ1 using a bead- 
based proximity assay. Alpha-screen titrations monitoring the displacement of 
a tetra-acetylated histone H4 peptide by JQ1 isomers using the bromodomains 
BRD4(1), BRD4(2) or of an acetylated H3 peptide using CREBBP are shown. 

dissociation constants (Kd)**”°. Because the sensitivity of this assay
may vary between different protein families, isothermal titration 
calorimetry (ITC) was used to determine binding constants precisely. 
Enantiomerically pure (+)-JQ1 bound with a Kd of about 50 nM and
90 nM to the first and second bromodomains of BRD4, respectively
(Fig. 1c and Supplementary Table 3). Comparable binding to both
domains of BRD3 was observed, whereas the first bromodomains of 
BRDT and BRD2 revealed about threefold weaker binding. Affinities 
determined by ITC and ΔTm^obs values showed very good correlation
(Fig. 1d). Importantly, (+)-JQ1 showed no detectable binding to bro- 
modomains that exhibited minimal temperature shifts, such as 
WDR9(2) and CREBBP. 
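
As a rough illustration of what these dissociation constants imply, the fraction of a bromodomain occupied by ligand under a simple 1:1 (single-site) model can be computed from the total concentrations and Kd. The sketch below is not the fitting procedure used for the ITC data; the function name `fraction_bound` and the example protein and ligand concentrations are assumptions chosen purely for illustration.

```python
import math


def fraction_bound(protein_nM: float, ligand_nM: float, kd_nM: float) -> float:
    """Fraction of protein occupied for 1:1 binding (exact quadratic solution)."""
    total = protein_nM + ligand_nM + kd_nM
    # Concentration of the protein-ligand complex, accounting for ligand depletion.
    complex_nM = (total - math.sqrt(total * total - 4.0 * protein_nM * ligand_nM)) / 2.0
    return complex_nM / protein_nM


# Kd values are those reported in the text for BRD4(1) and BRD4(2);
# the 10 nM protein and 500 nM ligand concentrations are hypothetical.
for name, kd in (("BRD4(1)", 50.0), ("BRD4(2)", 90.0)):
    print(name, round(fraction_bound(10.0, 500.0, kd), 3))
```

With ligand in large excess over protein, the occupancy approaches the familiar L / (L + Kd), so a ligand concentration equal to Kd gives roughly half-saturation.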

To assess whether (+)-JQ1 binding was competitive with acetyl- 
lysine, we adapted a luminescence proximity homogeneous assay 
(alpha-screen)”” to the BET bromodomains. Binding of a tetra- 
acetylated histone H4 peptide to BRD4 was strongly inhibited by 
(+)-JQ1, with half-maximum inhibitory concentration (IC50) values
of 77 nM and 33 nM for the first and second bromodomain, respectively
(Fig. 1e). The IC50 for the (−)-JQ1 stereoisomer against BRD4(1)
and for (+)-JQ1 against CREBBP were both estimated to be above
10,000 nM (Fig. 1e). Thus, (+)-JQ1 represents a potent, highly specific
and Kac-competitive inhibitor for the BET family of bromodomains. 
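
To make the IC50 values concrete, a standard sigmoidal dose-response relationship gives the fraction of peptide-binding signal expected to remain at a given inhibitor concentration. This is only a sketch: the Hill slope of 1 and the function name `fraction_remaining` are assumptions, and the curve-fitting model actually applied to the alpha-screen data is not stated in this section.

```python
def fraction_remaining(conc_nM: float, ic50_nM: float, hill: float = 1.0) -> float:
    """Fraction of uninhibited signal remaining at a given inhibitor concentration."""
    return 1.0 / (1.0 + (conc_nM / ic50_nM) ** hill)


# IC50 values are those reported in the text; the Hill slope of 1 is an assumption.
for name, ic50 in (("BRD4(1)", 77.0), ("BRD4(2)", 33.0)):
    print(name, [round(fraction_remaining(c, ic50), 2) for c in (10.0, 100.0, 1000.0)])
```

By construction, the signal is halved when the inhibitor concentration equals the IC50.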


(+)-JQ1 binds to the Kac binding site of BET
bromodomains 


To establish the binding mode of JQ1 we determined co-crystal struc- 
tures using racemic material and purified, recombinant BRD4(1) and 
BRD2(2) (for data collection and refinement statistics see Supplemen- 
tary Table 4). The determined high-resolution structures revealed that 
only the (+)-JQ1 enantiomer bound directly into the Kac binding site 
(Figs 2 and 3a, b). Similar to interactions observed in acetyl-lysine 
complexes”*, the triazole ring formed a hydrogen bond with the evo- 
lutionarily conserved asparagine (Asn 140 in BRD4(1) and Asn 429 in 

Figure 2 | Characterization of BET complexes with (+)-JQ1.

a, Superimposition of the mouse BRD4(1)-H3K14ac peptide complex”® with 
the human BRD4(1)-(+)-JQ1 complex structure. The hydrogen bond formed 
to the conserved asparagine (N140) in the peptide complex is shown as yellow 
dots. b, 2Fo − Fc map of (+)-JQ1 in complex with BRD4(1) contoured at 2σ.
c, Electrostatic surface of BRD4(1) in complex with (+)-JQ1. The ligand is 
shown as a Corey-Pauling-Koltun (CPK) model demonstrating the excellent
shape complementarity with the protein acetylated lysine receptor site. 

d, Ribbon diagram of the complex of human BRD4(1) with (+)-JQ1 in CPK 
representation. The main secondary structural elements and the conserved 
active site asparagine side chain (N140) are labelled. 

BRD2(2); Fig. 2a). The inhibitor showed an extraordinary shape complementarity
with the Kac binding site, occupying the entire binding
pocket (Fig. 2c, d). In both complexes, ligand binding was stabilized by 
hydrophobic interactions with conserved BET residues in the ZA- and 
BC-loop regions (Fig. 3a, b). Structural and sequence comparison 
showed high conservation of the BET Kac binding pocket, but 
revealed a number of differences in loop regions lining the binding 
cavity that could be explored for future development of isoform-specific
inhibitors (Fig. 3a–c).

Docking of either isomer of JQ1 to BRD4(1) resulted in excellent fit of 
(+)-JQ1 in a position of perfect overlap to the crystallographically 
determined binding mode, whereas (−)-JQ1 resulted in an energetically
unfavourable conformation with significant distortion of the diazepine 
ring system due to steric clashes with residues of the ZA-loop region 
(Fig. 3d). To explore the dynamic features of BET bromodomains, we 
carried out 20-ns molecular dynamics simulations of BRD4(1) in the 
absence and presence of (+)-JQ1. The simulations revealed little 
displacement of the protein helices, but the loop regions surround- 
ing the acetyl-lysine binding site showed significant conformational 

Figure 3 | Binding site comparison between N- and C-terminal
bromodomains in complex with (+)-JQ1. a, The acetyl-lysine binding pocket 
of BRD4(1) is shown as a semi-transparent surface with contact residues 
labelled and depicted in stick representation. Carbon atoms in (+)-JQ1 are 
coloured yellow to distinguish them from protein residues. Distinguishing 
surface residues are shown in red; the family conserved asparagine is shown in 
blue. b, The acetyl-lysine binding pocket of BRD2(2) is shown in identical 
representation and orientation as described in a. c, Protein sequence alignment 
of the human BET sub-family highlighting conserved (red) and similar (yellow) 
residues. Major bromodomain structural elements are shown. The side-chain 
contacts with (+)-JQ1 are annotated with a black star. Contacts which differ 
between the N- and C-terminal BET bromodomains (red star) are highlighted. 
The conserved asparagine is indicated by a blue star. d, Models of (+)-JQ1 (in 
yellow) and (−)-JQ1 (in green) docked into the BRD4(1) binding site. The
steric clashes of the (−)-JQ1 stereoisomer with Leu 92 and Leu 94 are
highlighted in red. e, Molecular dynamics simulation demonstrating the 
flexibility of the ZA and BC loops of the BRD4(1) apo-structure. Shown is the 
backbone of BRD4(1) during a 20-ns simulation as snapshots separated by 1-ns 
intervals. The different structures are distinguished by colours changing from 
blue to green as indicated in the inset. f, Molecular dynamics simulation of the 
BRD4(1)-(+)-JQ1 complex depicted in 1-ns snapshots as described in e. 

flexibility. Furthermore, these loops were much more flexible in the
absence (Fig. 3e) than in the presence (Fig. 3f) of the inhibitor, indicating 
that (+)-JQ1 stabilized the Kac binding cavity. In all cases, molecular 
dynamics simulation energies converged (Supplementary Fig. 2). 


JQ1 displaces BRD4 from nuclear chromatin in cells

To establish whether JQ1 binds bromodomains competitively with 
chromatin in a cellular environment, we performed fluorescence 
recovery after photobleaching (FRAP) experiments. Previous research 
has demonstrated the utility of FRAP in assessing the pace of lateral 
redistribution of human bromodomains’’”’. Human osteosarcoma 
cells (U2OS) transfected with GFP-BRD4 show a time-dependent 
recovery of fluorescence intensity (Fig. 4a, b). In the presence of JQ1 
(500 nM), the observed recovery is immediate, indicating displaced 
and freely diffusing nuclear BRD4 (Fig. 4a, b). Cellular FRAP studies 
confirmed that the effects on the mobile fraction of BRD4 are limited to 
the biochemically active (+)-JQ1 stereoisomer (Supplementary Fig. 3). 
Having demonstrated potent, selective binding to BRD4 in homo- 
geneous and cell-based assays, we became interested to explore the 
effects of JQ1 on disease-relevant phenotypes. Previous studies have 
established that the pathogenic BRD4-NUT fusion protein arising 
from the t(15;19) translocation in NMC binds avidly to discrete foci
of acetylated chromatin, conferring a proliferative advantage and dif- 
ferentiation block'’. Using FRAP, we assessed the ability of JQ1 to 
target directly the BRD4-NUT oncoprotein. Compared to a vehicle 
control, JQ1 (500 nM) markedly accelerated time to half fluorescence 
recovery in photobleached regions of cells transfected with GFP- 
BRD4-NUT (Fig. 4c, d). Notably, no effect was observed on re- 
distribution of GFP-NUT (Supplementary Fig. 3). These data are 
consistent with competitive binding of JQ1 to BRD4 in cultured cells. 
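
The time to half-maximal fluorescence recovery used to compare FRAP conditions can be read off a recovery curve by interpolation. The following is a minimal sketch under stated assumptions: the first sample is taken as the post-bleach minimum and the last as the recovery plateau, the function name `half_recovery_time` is hypothetical, and the example trace is synthetic rather than measured data. It is not the analysis pipeline used in this study.

```python
from typing import Sequence


def half_recovery_time(times: Sequence[float], intensity: Sequence[float]) -> float:
    """Time at which the signal first reaches halfway between the
    post-bleach minimum (first point) and the final plateau (last point)."""
    start, plateau = intensity[0], intensity[-1]
    target = start + 0.5 * (plateau - start)
    for i in range(1, len(times)):
        lo, hi = intensity[i - 1], intensity[i]
        if lo < target <= hi:
            # Linear interpolation between the two bracketing samples.
            frac = (target - lo) / (hi - lo)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("recovery curve never crosses the half-maximal level")


# Synthetic, normalized recovery trace (seconds, arbitrary intensity units).
print(half_recovery_time([0.0, 2.0, 4.0, 6.0], [0.0, 0.25, 0.75, 1.0]))
```

A faster-diffusing protein population, as expected when BRD4 is displaced from chromatin, would show a shorter half-recovery time under this measure.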


JQ1 induces differentiation and growth arrest in NMC 

Direct inhibition of gene products expressed from recurrent, onco- 
genic translocations is a validated therapeutic approach in cancer*™”’. 
We thus endeavoured to establish the consequences of competitive 
inhibition of BRD4-NUT in NMC. A characteristic feature of NMC is 
the appearance of discrete nuclear speckles of the BRD4-NUT oncoprotein
by NUT-directed immunohistochemistry”’. Treatment of the
patient-derived 797 NMC cell line for 48 h with JQ1 (500 nM) effaces 
nuclear foci, producing diffuse nuclear NUT staining by immuno- 
histochemistry (Supplementary Fig. 3e). In a dose- and time-dependent 
manner, JQ1 provokes a differentiation phenotype in NMC cell lines, 

Figure 4 | JQ1 binds BRD4 competitively with chromatin resulting in
differentiation and growth arrest of NMC cells. a, Fluorescence recovery
after photobleaching (FRAP) of GFP-BRD4 demonstrates enhanced recovery
in the presence of JQ1. Nuclei are false-coloured in proportion to fluorescence
intensity. White circles indicate target regions of photobleaching. b, c, JQ1
accelerates fluorescence recovery in FRAP experiments performed with
transfected GFP-BRD4 (b) and GFP-BRD4-NUT (c). d, Quantitative
comparison of time to half-maximal fluorescence recovery for FRAP studies
(b, c, Supplementary Fig. 3a). Data represent the mean ± s.d. (n = 5), and are
annotated with P-values as obtained from a two-tailed t-test. NS, not
significant. e, Differentiation of NMC cells by JQ1 (500 nM) is prompt and
characterized by a marked increase in cytokeratin expression (mouse
anti-cytokeratin clone AE1/AE3; ×10, scale bar is 50 µm). f, Comparative gene
expression studies of (+)-JQ1 (red; 250 nM, 48 h) versus (−)-JQ1 (grey;
250 nM, 48 h) and vehicle (black) confirm squamous differentiation. Data
represent the mean ± s.d. (n = 3), and are annotated with P-values as obtained
from a two-tailed t-test. g, Growth effects of BRD4 inhibition on
BRD4-NUT-dependent cell lines. Cells were incubated with (+)-JQ1 (red circles) or
(−)-JQ1 (black circles) and monitored for proliferation after 72 h. (+)-JQ1 uniquely
attenuates proliferation by NMC cell lines. Data are presented as mean ± s.d.
(n = 3). Curve fit was calculated by logistic regression. h, Flow cytometry for
DNA content in NMC 797 cells. (+)-JQ1 (250 nM, 48 h) induces a G1 arrest
compared to (−)-JQ1 (250 nM) and vehicle control. Data are presented as a
histogram of nuclear fluorescence intensity. i, Flow cytometric analysis of NMC
797 squamous carcinoma cells treated with vehicle, JQ1 or staurosporine
(STA), as indicated. AV, annexin-V; PI, propidium iodide.

featuring cell spreading and flattening, open chromatin and striking
spindle morphology (Fig. 4e and Supplementary Fig. 4). Differentiation 
is prompt (<24h) and characterized by marked changes in cell shape 
accompanied by markedly augmented expression of cytokeratin, a hall- 
mark of squamous differentiation (Fig. 4e). After 7 days in culture with 
submicromolar exposures to JQ1, terminal differentiation is observed. 
In this manner, JQ1 phenocopies the morphological changes and 
increased keratin expression observed with BRD4-NUT silencing by
RNA interference (Supplementary Fig. 5)'”. Corroborating these studies, 
expression analysis of three canonical squamous tissue genes by RT- 
PCR identified marked (30-fold) induction of keratin-14 by (+)-JQ1 in 
NMC 797 cells (Fig. 4f). The modest induction of keratin-10 without 
affecting epidermal transglutaminase (TGM1) may indicate differenti- 
ation towards thoracic squamous epithelium, consistent with the 
mediastinal primary tumour from which NMC 797 cells derive”. 
Induction of differentiation with intense keratin staining is progressive 
over 72 h, as determined by quantitative immunohistochemistry analysis 
(Supplementary Fig. 6). Supporting an on-target mechanism-of-action, 
the (−)-JQ1 enantiomer is comparatively inactive in NMC, and a
non-BRD4-dependent squamous carcinoma cell line (TE10) fails to exhibit
differentiation effects of active JQ1 (Supplementary Fig. 4c). 

In BRD4-dependent NMC cells, differentiation is expectedly 
accompanied by growth arrest, as demonstrated by reduced Ki67 
staining (Supplementary Fig. 7), sustained inhibition of proliferation 
(Fig. 4g; Supplementary Fig. 8) and G1 cell-cycle arrest (Fig. 4h). To 
understand further the observed G1 arrest and to confirm an effect of 
JQ1 on known BRD4-dependent genes, we performed quantitative 
RT-PCR for RAD21 and RAN (ref. 4). (+)-JQ1 potently decreased 
expression of both BRD4 target genes, whereas (−)-JQ1 had no effect
(Fig. 4f). Early and late apoptosis were assessed with annexin-V and 
propidium iodide staining to ascertain whether the antiproliferative 
effect and irreversible differentiation was accompanied by cell death. 
Indeed, JQ1 induces immediate and progressive apoptosis in BRD4- 
dependent human carcinoma cells, without triggering significant 
growth arrest or cell death in cell lines lacking the BRD4-NUT fusion
(Supplementary Figs 8 and 9). 


Antitumour efficacy of JQ1 in xenograft models of NMC 


To determine whether JQ1 could attenuate the growth of BRD4- 
dependent carcinoma as a single agent in vivo, we developed three 
xenograft models of NMC in mice. First, short-term treatment studies 
were performed in NMC 797 xenografts with positron-emission 
tomography (PET) imaging of 18F-fluorodeoxyglucose (FDG) uptake
as a primary endpoint to explore whether activity of JQ1 could be 
demonstrated by non-invasive imaging. Matched cohorts of mice 
with established tumours were randomized to treatment with JQ1 
(50 mg kg⁻¹) or vehicle, administered by daily intraperitoneal injection.
Before randomization, and after 4 days of therapy, mice were
evaluated by FDG-PET imaging. A marked reduction in FDG uptake 
was observed with JQ1 treatment, whereas vehicle-treated mice 
demonstrated progressive disease (Fig. 5a). Tumour-volume measure- 
ments confirmed a reduction in tumour growth with JQ1 treatment 
(Fig. 5b and Supplementary Fig. 10). JQ1 was well tolerated at this dose 
and schedule without overt signs of toxicity or weight loss (Supplemen- 
tary Fig. 10b). 

To confirm that the antineoplastic effect observed with JQ1 treat- 
ment was associated with target engagement, sectioned tumour tissue 
was examined for the BRD4-NUT oncoprotein. As presented in Sup- 
plementary Fig. 11, JQ1 treatment resulted in effacement of NUT nuclear 
speckles, consistent with competitive binding to nuclear chromatin. Cell 
spreading and increased keratin expression confirmed induction of 
squamous differentiation (Fig. 5c). Decreased nuclear Ki67 and increased 
TUNEL staining in treated animals confirmed an ongoing antiprolifera- 
tive, pro-apoptotic effect (Supplementary Fig. 11). To quantify the phar- 
macodynamic biomarker of tumour keratin expression, we established 
protocols for automated immunohistochemistry image acquisition 
and analysis. Paired samples from treated and untreated animals were
prepared and analysed using standardized protocols and commercially 
available software (ImageScope; Aperio Technologies), demonstrating 
that JQ1 induced strong (grade 3+) keratin expression in NMC 797 
xenografts (Supplementary Fig. 12). 

In parallel with these studies, we had occasion to care for a 29-year-old
patient with widely metastatic BRD4-NUT-positive NMC arising
from the mediastinum. With the goal of developing a more clinically 
relevant disease model, we established short-term cultures (11060 
cells) using discarded clinical material obtained from pleural fluid 
draining from a palliative chest tube. As presented in Supplemen- 
tary Fig. 13, in vitro studies confirmed the stereospecific, potent effect 
of (+)-JQ1 on cellular viability (IC50 = 4 nM), growth and cell cycle
progression. Four animals engrafted with patient-derived tumour 
material developed measurable disease, which was strongly FDG-avid 
by PET imaging (Fig. 5d). Animals were randomly assigned to vehicle 
or (+)-JQ1 treatment. Before treatment and after 4 days of therapy, 
mice were evaluated by PET imaging. A marked response in FDG 
uptake was observed with (+)-JQ1 treatment, whereas vehicle-treated 
animals demonstrated progressive disease (Fig. 5e). Tumour material 
prepared for quantitative immunohistochemistry analysis demonstrated 
induction of keratin expression after (+)-JQ1 treatment (Fig. 5f, g and 
Supplementary Fig. 14) in this minimally passaged NMC xenograft 
model. 

To confirm the translational potential of direct-acting BRD4 inhibi- 
tion in NMC, we further adapted the patient-derived 11060 cells to 
expansion in vivo, and performed definitive efficacy studies. Marked 
tumour regression and prolonged overall survival were observed, after 
only 18 days of well-tolerated therapy with (+)-JQ1 (Fig. 5h, i). These 
results were recapitulated in a third NMC xenograft model, using 
Per403 cells (Fig. 5j, k and Supplementary Fig. 15). Together, these 
data establish in vivo proof-of-concept for targeting BRD4 with JQ1 in
NMC. 


Discussion 


Across the complex landscape of the cancer genome, recurrent chromo- 
somal rearrangements comprise a compelling subset of clear, genetic 
targets in cancer. As evidenced by the successful development of first- 
and second-generation kinase inhibitors targeting BCR-ABL in chronic 
myelogenous leukaemia, well-characterized probe compounds**”, 
high-resolution crystallographic data’®, translational research studies”, 
and informative murine models*’, where available, provide an optimal 
platform for ligand discovery and target validation. Herein, we provide 
comparable evidence supporting the BRD4-NUT fusion as a therapeutic
target in an incurable, genetically-defined human squamous
carcinoma, using a novel BRD4-directed small-molecule inhibitor. 

Beyond NUT-midline carcinoma, BET-family bromodomains 
contribute to other neoplastic and non-neoplastic diseases. BRD4 
targets the P-TEFb complex to mitotic chromosomes, resulting in 
the expression of growth-promoting genes such as c-Myc'*™ and the 
well-established cancer target Aurora B*’. BET family members have 
been recognized as essential genes for the replication of viruses***' and in 
mediating inflammatory responses”. Thus, the availability of (+)-JQ1 
will prompt informative research broadly in developmental and disease 
biology. JQ1 possesses many desirable qualities of a chemical probe, 
such as high target potency in homogeneous and cellular assays, a 
well-characterized profile of selectivity, synthetic accessibility and herein 
proven utility in experimental biology'*’”. We have also found JQ1 to 
exhibit few off-target effects on cellular receptors and excellent pharma- 
cokinetic properties including 49% oral bioavailability (Supplementary 
Figs 16 and 17 and Supplementary Table 5), establishing the plausibility 
of developing drug-like derivatives for therapeutic application. 

The discovery and optimization of small-molecule inhibitors of 
epigenetic targets is a major focus of current biomedical research”. 
We sought to meet the challenge of developing potent, selective inhibitors 
of epigenetic readers. Here we present a first, thoroughly characterized 

Figure 5 | JQ1 promotes differentiation, tumour regression and prolonged
survival in murine models of NMC. a, PET imaging of murine NMC 797
xenografts. FDG uptake in xenograft tumours is reduced by 50 mg kg⁻¹ JQ1
treatment compared to vehicle control. Arrows indicate the anatomical
location of tumour xenograft. b, Tumour volume is reduced in mice with
established disease (NMC 797 xenografts) treated daily with 50 mg kg⁻¹ JQ1
compared to vehicle control. A significant response to therapy is observed by a
two-tailed t-test at 14 days (P = 0.039). Data represent the mean ± s.d. (n = 7).
c, Histopathological analysis of NMC 797 tumours excised from animals
treated with JQ1 reveals induction of keratin expression (mouse
anti-cytokeratin clone AE1/AE3, ×40) and impaired proliferation (Ki67, ×40), as
compared to vehicle-treated animals (scale bar is 20 µm). d, Viability of
patient-derived NMC 11060 xenografts was confirmed by PET imaging. Arrow
indicates the anatomical location of tumour xenograft. e, Therapeutic response
of primary 11060 NMC xenografts to (+)-JQ1 (50 mg kg⁻¹ daily for 4 days)
was demonstrated by PET imaging. Integrated signal encompassed within the


inhibitor of the BET-family of bromodomains. The approach outlined 
herein further establishes the feasibility of abrogating protein-protein 
interactions with small molecules, and targeting additional epigenetic 
readers for ligand discovery. 


METHODS SUMMARY 


The inhibitor JQ1 was synthesized in both racemic and enantiomerically pure 
format using the synthetic route outlined in scheme 1 and scheme 2 (Supplemen- 
tary Methods) and its structure was fully characterized. Human bromodomains 
were expressed in bacteria as His-tagged proteins and were purified by nickel- 
affinity and gel-filtration chromatography. Protein integrity was assessed by 
SDS-PAGE and electro-spray mass spectrometry on an Agilent 1100 Series 
LC/MSD TOF. All crystallizations were carried out at 4 °C using the sitting-drop
vapour-diffusion method. X-ray diffraction data were collected at the Swiss Light
Source beamline X10SA, or using a Rigaku FR-E generator. Structures were
determined by molecular replacement. Isothermal titration calorimetry experi- 
ments were performed at 15°C on a VP-ITC titration microcalorimeter 
(MicroCal). Thermal melting experiments were carried out on an Mx3005p 
RT-PCR machine (Stratagene) using SYPRO Orange as a fluorescence probe. 
Dose-ranging small-molecule studies of proliferation were performed in white, 
384-well plates (Corning) in DMEM media containing 10% FBS. Compounds 


tumour volume is presented as the per cent injected dose per gram (% ID per g).
f, Histopathological analysis of primary NMC 11060 tumours excised from
animals treated with (+)-JQ1 reveals induction of keratin expression (mouse
anti-cytokeratin clone AE1/AE3, ×20; scale bar is 20 µm), compared to
vehicle-treated animals. Quantitative analysis of keratin induction was performed using
image masking (f, right panel) and pixel positivity analysis (g). A significant
response to therapy is observed by a two-tailed t-test (P = 0.0001). Data
represent the mean ± s.d. of three independent wide microscopic fields.
Comparative images of stained excised tumours and quantitative masks are
provided in Supplementary Fig. 14. h–k, (+)-JQ1 (red circles and lines;
50 mg kg⁻¹ daily for 18 days) produces a decrease in tumour volume (h, j) and
promotes prolonged survival (i, k) in patient-derived 11060 (h, i) and Per403
(j, k) NMC xenograft models (n = 10 in all groups). A significant response to
therapy is observed for tumour volume by a two-tailed t-test (P < 0.0001) and
for overall survival by a log-rank test (P < 0.0001). Black circles and lines,
vehicle.


were delivered with a JANUS pin-transfer robot and proliferation measurements 
were made on an Envision multilabel plate-reader (PerkinElmer). Murine xenografts 
were established by injecting NMC cells in 30% Matrigel (BD Biosciences) into the 
flank of 6-week-old female NCr nude mice (Charles River Laboratories). Tumour 
measurements were assessed by caliper measurements, and volume was calculated 
using the formula Vol = 0.5 × L × W². All mice were humanely killed, and tumours
were fixed in 10% formalin for histopathological examination. Quantitative 
immunohistochemistry was performed using the Aperio Digital Pathology 
Environment (Aperio Technologies) at the DF/HCC Core Laboratory at the 
Brigham and Women’s Hospital. 
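
The caliper-based volume estimate described above (half the length multiplied by the square of the width) can be written as a one-line function. This is a minimal sketch: the function name `tumour_volume`, the assumption that measurements are in millimetres, and the example readings are illustrative and not taken from the study.

```python
def tumour_volume(length_mm: float, width_mm: float) -> float:
    """Caliper-based estimate: Vol = 0.5 x L x W^2 (mm^3 if inputs are in mm)."""
    return 0.5 * length_mm * width_mm ** 2


# Hypothetical caliper readings, for illustration only.
print(tumour_volume(10.0, 8.0))
```

Because the width enters as a square, small errors in the width measurement affect the estimate more than comparable errors in the length.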

Driving the cell cycle with a minimal
CDK control network 


Damien Coudreuse & Paul Nurse


Control of eukaryotic cell proliferation involves an extended regulatory network, the complexity of which has made it 
difficult to understand the basic principles of the cell cycle. To investigate the core engine of the mitotic cycle we have 
generated a minimal control network in fission yeast that efficiently sustains cellular reproduction. Here we demonstrate 
that orderly progression through the major events of the cell cycle can be driven by oscillation of an engineered 
monomolecular cyclin-dependent protein kinase (CDK) module lacking much of the canonical regulation. We show 
further that the CDK oscillator acts as the primary organizer of the cell cycle, imposing timing and directionality to a 
system of two CDK activity thresholds that define independent cell cycle phases. We propose that this simple core 
architecture forms the basic control of the eukaryotic cell cycle. 


Progression through the eukaryotic cell cycle is driven by CDKs, 
which form bipartite complexes with different cyclins’. Changes in 
activity of these complexes depend on oscillations in levels of the 
cyclins, the synthesis and degradation of which are regulated through- 
out the cell cycle, while combinatorial associations of CDKs and 
cyclins are thought to generate the distinct substrate specificities 
required to bring about the different cell cycle transitions*’. Control 
of the respective expression and subcellular localization of central 
CDK machinery subunits constitutes a primary layer of cell cycle 
regulation*. In addition, CDK activity is modulated by specific inhi- 
bitors and by changes in phosphorylation of the catalytic subunit in 
response to inputs such as nutrient availability, cell size, and activa- 
tion of checkpoint mechanisms’. Some of these controls form feed- 
back loops that generate sharp changes in activity with hysteretic 
properties, contributing to the unidirectionality of the cell cycle’. 
Integration of all these parameters ensures orderly progression 
through the mitotic cycle and appropriate responses to perturbations. 

The complexity of eukaryotic cell cycle control has made it difficult 
to fully understand its basic principles, as demonstrated by the plas- 
ticity reported for certain key cell cycle effectors'”’. To investigate 
the core engine of the mitotic cycle, we have generated a minimal 
control network in the fission yeast Schizosaccharomyces pombe. We 
show that oscillation of a single monomolecular CDK module in the 
absence of many of the known regulatory inputs and feedbacks is 
sufficient to sequentially trigger the major cell cycle events. We 
demonstrate further that the core cycle can be built on a circuit of 
two CDK activity thresholds defining independent states with no 
inherent directionality, upon which sequence and timing are imposed 
by a single CDK oscillator. 


A minimal cell cycle in fission yeast 
The fission yeast cell cycle is controlled by a single CDK, Cdc2, 
required for both the G1/S and G2/M transitions’*’. DNA replication 
and mitosis are triggered by association of Cdc2 with the B-type 
cyclins Cig2 and Cdc13'®, respectively, with two additional cyclins, 
Cig1 and Puc1, having more minor roles in G1'’”?*. In addition, 
Cdc13/Cdc2 activity in G2 blocks reinitiation of DNA replication”®”’. 
To simplify the cell cycle control machinery, a cassette expressing a 
fusion of cdc13 and cdc2 under the control of the cdc13 regulatory 


elements was integrated into the genome (cdc13-L-cdc2; Fig. 1a). This 
minimal CDK system differs from that operative in wild-type cells in 
several ways: (1) the regulatory and catalytic subunits are subject to 
the same transcriptional, translational and degradation programs, are 
always present in a 1:1 ratio, and always co-localize; (2) the rise in 
CDK activity is not triggered by a separate cyclin concentration 
threshold; (3) fusing the kinase with a specific cyclin is likely to pre- 
vent association with other cyclins and renders modulators of binding 
between the two subunits irrelevant. The fusion protein was active 
and additive to the function of the endogenous CDK machinery 
(Supplementary Fig. 1). We next deleted the genomic copies of cdc2 


[Figure 1, panels a and b: schematic of the cdc13-L-cdc2 cassette (Pcdc13,
cdc13, linker L, cdc2, cdc13 3′ UTR) and western blots probed with anti-Cdc13
(bands F and E) and anti-tubulin. Labels: 1, wild type; 2, cdc13-L-cdc2 Δ2Δ13;
3, cdc13-L-cdc2 Δ2Δ13ΔCCP.]


Figure 1 | A Cdc13-L-Cdc2 fusion in fission yeast. a, Schematic 
representation of the cdc13-L-cdc2 fusion. Numbers are open reading frame 
coordinates. L, linker. Pcdc13, cdc13 promoter and 5’ untranslated region 
(UTR). b-d, Labels are indicated in box. b, Western blots probed for Cdc13 and 
tubulin. E, endogenous Cdc13 (56 kDa). F, Cdc13-L-Cdc2 (91 kDa). A single 
band was also observed using an anti-Cdc2 antibody (data not shown). 

c, Blankophor staining of exponentially growing cells. Scale bar, 10 µm. d, DNA
content analysis of cells in c (see Methods). e, Exponentially growing cdc13-L-
cdc2-YFP Δ2Δ13ΔCCP cells at 25 °C. YFP and DNA imaging (Hoechst) of
individual cells arranged according to their cell cycle stage. Dashed lines show
cell outlines. Scale bar, 5 µm.
and cdc13 (Δ2Δ13). In this background, Cdc13-L-Cdc2 was detected
as a single full-length protein (Fig. 1b) and cdc13-L-cdc2 Δ2Δ13 cells
were almost identical to wild type, having a normal generation time 
and only a small increase in cell length at division (Fig. 1c and 
Table 1). DNA content analysis showed the same profile as wild type 
(Fig. 1d and Methods), indicating that the relative durations of the 
different cell cycle phases were maintained. 

To demonstrate that this minimal machinery autonomously drives 
cell cycle progression, two alterations were introduced within the 
fusion protein (Supplementary Fig. 2a): (1) Cdc13(C379Y) (here
referred to as Cdc13ts), generating a temperature-sensitive protein";
and (2) Cdc2(F84G) (here referred to as Cdc2as; as, analogue sensitive),
rendering the kinase sensitive to chemical inhibition”.
Treatment with inhibitor (ATP analogue NmPP1) or shift to restrictive
temperature arrested cdc13ts-L-cdc2as Δ2Δ13 cells in G2 (Supplementary
Fig. 2b). Next, we additionally deleted cig2, which encodes the
major S-phase cyclin. Absence of Cig2 had no effect on the timing of 
DNA replication (Supplementary Fig. 3a), but impairing function of 
either moiety of the fusion protein in G1 cells delayed S-phase onset 
(Supplementary Fig. 3b-e). These data show that both kinase and 
cyclin moieties of the CDK module are required to trigger the G1/S 
and G2/M transitions. 

The fission yeast genome contains 13 cyclin-like genes. Protein 
sequence comparisons showed that Cig1, Cig2 and Puc1, the only
other cyclins known to have mitotic cell cycle functions in complexes
with Cdc2, were the only mitotic cyclins that clustered with Cdc13
(Supplementary Fig. 4). To simplify the network further, we therefore
deleted cig1, cig2 and puc1 (ΔCCP) in cdc13-L-cdc2 Δ2Δ13 cells. This
strain had no apparent cell cycle defects (Fig. 1b-d and Table 1). In
this background, the CDK module oscillated in abundance, peaking at
the end of G2 and disappearing at mitotic exit, and recapitulated the
normal cell cycle changes in Cdc13 subcellular localization” (Fig. 1e
and Supplementary Fig. 5).

These results demonstrate that a single monomolecular CDK module 
lacking several regulatory features of the endogenous machinery is suf- 
ficient to trigger the two major cell cycle transitions and sustains an 
effective mitotic cycle. Other endogenous cyclin/CDK complexes that 
may have more peripheral roles in cell cycle regulation, including Mcs2/ 
Mcs6"! and Pas1/Pef1*”, cannot substitute for the CDK fusion and so do 
not have direct roles in driving the onsets of S and M phases. 


Oscillations in CDK activity 
Next we investigated how a single CDK module distinguishes between 
G1/S and G2/M. Using cdc13-L-cdc2as Δ2Δ13ΔCCP cells (Table 1 and
Supplementary Fig. 6), we asked whether progression through S and 
M in the absence of much of the canonical regulation is primarily 
mediated by distinct thresholds of a single CDK activity. 

First, different inhibitor concentrations were added to synchronized 
cells in early G2. Timing of mitosis was delayed in a dose-dependent 
manner, with concentrations of 300 nM and above preventing mitosis 


Table 1 | Characterization of cells operating with the minimal CDK module

Genotype                              Size at division (µm)   Generation time (min)   Dead cells (%)
Wild type                             14 ± 0.1                160 ± 3                 1.3 ± 0.8
cdc13-L-cdc2 Δ2Δ13                    15.6 ± 0.3              163 ± 2                 ND
cdc13-L-cdc2 Δ2Δ13ΔCCP                15.9 ± 0.2              162 ± 6                 0.2 ± 0.2
cdc13-L-cdc2as Δ2Δ13ΔCCP              14.7 ± 0.2              158 ± 6                 ND
cdc13-L-cdc2 Δ2Δ13ΔCCP Δrum1*         15.9 ± 0.1              155 ± 3                 ND
cdc13-L-cdc2AF Δ2Δ13ΔCCP*             13.9 ± 0.2              248 ± 0                 5.3 ± 0.9
cdc13-L-cdc2 Δ2Δ13ΔCCP Δwee1 Δmik1    13.7 ± 0                217 ± 0                 6.2 ± 1.3
wee1-50ts†                            6.9 ± 0.1               ND                      ND

Numbers are averages of three independent experiments with standard errors (n = 50 for cell size
determination and n = 400 for dead cells). ND, not determined.

* Similar sizes were obtained using NmPP1-sensitive strains.

† Cell size at division was measured after 6 h at restrictive temperature (36 °C).
(Fig. 2a and Supplementary Fig. 7a-d). Cells delayed in G2 elongated
and accumulated the fusion protein (Supplementary Fig. 8a, b). We 
surmised that a critical ratio of CDK module to inhibitor concentration 
must be reached to allow mitotic onset. Consistent with this, a popu- 
lation of large cells incubated with inhibitor in early G2 entered mitosis 
earlier than similarly treated small cells, but at the same size (Supplementary
Fig. 8c-g). Moreover, cells of asynchronous cultures treated
for 8 h with different NmPP1 concentrations were longer at division,
in proportion to the amount of inhibitor (Supplementary Fig. 7e).
Second, entry into S phase was monitored when synchronized cells 
were exposed to inhibitor in G1 (Supplementary Fig. 9a). None of the 
concentrations that affected G2/M had any effect on S phase (data not 
shown), even though G1 cells have lower levels of fusion protein 
(Supplementary Fig. 5). However, a dose-dependent delay in S-phase 
onset was observed when 1-5 µM NmPP1 was used (Fig. 2b and
Supplementary Fig. 9b-e). The difference in inhibitor concentration 
required to delay G1/S and G2/M supports the view that these transi- 
tions are associated with low and high kinase activities, respectively. 
Figure 2 | Oscillation of a single CDK activity between two thresholds.

a, Percentage of binucleated cells (includes septated cells; n = 400) in
synchronized cdc13-L-cdc2as Δ2Δ13ΔCCP cultures incubated with NmPP1
(added in early G2; T = 0 in Supplementary Fig. 7a). Concentrations above
300 nM also prevented mitosis (data not shown). b, DNA content analysis of
synchronized cdc13-L-cdc2as Δ2Δ13ΔCCP cells treated with NmPP1 after
mitotic onset (T = 0 in Supplementary Fig. 9a; flow cytometry profiles are in
Supplementary Fig. 9c). Inhibitor-treated cells arrested in the next G2 and
became elongated (data not shown). Block: 1 µM NmPP1 for 2 h 45 min at
32 °C. c, Inhibitor-mediated oscillation in activity using cdc13ΔDB-L-cdc2as
cdc2-33ts Δcig2 cells (see Supplementary Figs 10 and 11 for protocol and
description of the entire experiment). DAPI/Blankophor staining (left panel)
and DNA content analysis (right panel) at representative times during the first
artificial cycle. Scale bar, 10 µm. The percentage of cells with a 1C DNA content
after 40 min in 7.5 µM NmPP1 is indicated.
These data indicate that oscillation of qualitatively the same CDK 
activity between two thresholds may be the sole requirement to drive 
the minimal cell cycle. This interpretation predicts that substituting 
the oscillation in protein levels with an inhibitor-mediated oscillation 
in activity using a constantly present and non-degradable form of the 
CDK module should be sufficient to artificially drive the entire cycle. 
To test this, we deleted the Cdc13 destruction box within the fusion
protein (cdc13ΔDB-L-cdc2as) and drove its expression by the inducible
urg1 regulatory elements” (Fig. 2c and Supplementary Figs 10
and 11). cdc13ΔDB-L-cdc2as cdc2-33ts Δcig2 cells (see Methods) maintained
at restrictive temperature entered mitosis when expression of
the fusion cassette was induced. Impaired degradation of the protein
prevented mitotic exit, but addition of 7.5 µM NmPP1 allowed completion
of mitosis and cytokinesis with cells arresting in G1.
Subsequent reduction to 1 µM inhibitor resulted in rapid DNA replication.
Finally, another cycle of inhibitor oscillation allowed cells to
proceed through the next mitosis.

These results show that artificial modulation of the activity of a 
single stable CDK module enables progression through S and M, 
supporting the idea of a quantitative CDK model of the cell cycle’****». 
We propose that changes in protein levels and cyclin/CDK ratios are 
not essential to cell cycle regulation and that a simple oscillation in 
activity generated by a minimal control system is sufficient to drive 
the mitotic cycle. 


Resetting the cell cycle 


Next we asked whether the oscillator itself constitutes the primary 
system that sets the order and separation of cell cycle events. CDK 
activity was manipulated in cdc13-L-cdc2as Δ2Δ13ΔCCP cells to
determine if this alone could change cell cycle architecture. 

First, G2-arrested cells were released into medium with varying 
concentrations of inhibitor (Fig. 3a and Supplementary Fig. 12a-d). 
Cells released in dimethylsulphoxide (DMSO) resumed cycling, 
whereas 1 or 2.5 µM NmPP1 maintained the G2 block (data not
shown). In contrast, treatments with 5 µM NmPP1 and more led to
replication without an intervening mitosis, after delays reflecting the 
concentrations of inhibitor used and with increased amounts of 
fusion protein. These data demonstrate that when CDK activity is 
reduced to a low level, G2 cells bypass mitosis and enter a G1/S-like 
program. This is consistent with earlier studies showing that loss of 
cdc13, overexpression of the CDK inhibitor Rum1 or chemical inhibi- 
tion of Cdc2 induces re-replication’®*”””**””. In these cases, however, 
Cig1 and Cig2 are required, indicating that the minimal CDK network
used here renders cells independent of additional regulation present
in wild-type cells. Finally, G2-arrested cdc13-L-cdc2as Δ2Δ13ΔCCP
cells were subjected to a pulse of 10 µM NmPP1. Subsequent release
into 1 µM inhibitor resulted in rapid entry into S phase without an
intervening mitosis, showing that re-replication does not require per- 
sistence of low CDK activity (Supplementary Fig. 12e-g). 

Next we investigated how G1 cells respond to an abrupt switch to 
high CDK activity. G2-arrested cdc13-L-cdc2as Δ2Δ13ΔCCP cells
were reset in G1 as in Supplementary Fig. 12e, bypassing mitotic
degradation of the fusion protein, but released into inhibitor-free
medium. An overlap between S and M was observed in most cells,
resulting in aberrant nuclei and ‘cut’ phenotypes (G1 reset; Fig. 3b-d).
Similar results were obtained when synchronized cells were main- 
tained in G1 to allow accumulation of the fusion protein and then 
released into inhibitor-free medium (G1 arrest; Fig. 3b-d and 
Supplementary Fig. 13b). In this latter case, although bulk DNA syn- 
thesis appeared complete shortly after release, labelling of newly syn- 
thesized DNA showed that replication was still occurring when 
mitotic phenotypes were apparent (Supplementary Fig. 13c). 
Moreover, the presence of aberrant nuclei reflected late stages of the 
mitotic process. In both experiments, release into 1 µM NmPP1 suppressed
the mitotic phenotype but allowed DNA replication (data not
shown), indicating that these effects are due to early induction of high 


1076 | NATURE | VOL 468 | 23/30 DECEMBER 2010 


[Figure 3, panels c and d: c, percentage of aberrant mitotic nuclei against
time (min) for G1 reset (DMSO, 10 µM) and G1 arrest; d, DNA content profiles
(1C, 2C, 4C) for G1 reset and G1 arrest, including the block sample.]


Figure 3 | Resetting the cell cycle. a, DNA content analysis of G2-arrested
cdc13-L-cdc2as Δ2Δ13ΔCCP cells released in various concentrations of NmPP1
(T = 0). Block: 1 µM NmPP1 for 2 h 45 min at 32 °C. Black profiles show
S-phase onset. Cells treated with inhibitor did not undergo mitosis
(Supplementary Fig. 12a). b-d, G1 reset: cdc13-L-cdc2as Δ2Δ13ΔCCP cells were
treated as in Supplementary Fig. 12e but released into inhibitor-free medium
(T = 0). G1 arrest: synchronized cdc13-L-cdc2as Δ2Δ13ΔCCP cells as in
Supplementary Fig. 9a were arrested in G1 for 2 h with 10 µM NmPP1 before
release (T = 0). b, DAPI/Blankophor staining (see Supplementary Fig. 13a for
the entire time course). Scale bar, 10 µm. c, Percentage of aberrant mitotic
nuclei, including elongated, fragmented, asymmetrically divided and ‘cut’
nuclei (n = 400). d, DNA content analysis. Black profiles show first detection of
significant DNA synthesis.


CDK activity which simultaneously brings about S and M. 
Furthermore, this shows that S phase can proceed with high CDK 
activity as long as cells have previously experienced low activity. 
Finally, when S-phase progression was blocked using hydroxyurea 
in these experiments, cells entered S and M simultaneously but failed 
to segregate distinct DNA masses (Fig. 4a and Supplementary Fig. 14), 
supporting the interpretation that the observed phenotype is a con- 
sequence of the mitotic machinery attempting to segregate incomple- 
tely replicated DNA. These data establish that cells with sufficient 
CDK activity can enter mitosis from inappropriate points in the cell 
cycle regardless of their previous state. 

The apparent independency of S and M conflicts with models that 
link mitosis with completion of DNA replication through the S-phase 
checkpoint. Surprisingly, we found that the checkpoint remained inac- 
tive (Fig. 4b and Supplementary Fig. 15a), indicating that it did not 
sense this overlap between S and M as a pathological situation. 
Figure 4 | Timing and directionality of the minimal cell cycle. a, Percentage
of aberrant mitotic nuclei in cells treated as in Fig. 3b-d but released in normal
medium or in 12 mM hydroxyurea (HU) (n = 200). b, Western blot as in
Fig. 3b-d probed for the checkpoint effector Cds1. Controls are G2-arrested
cdc13-L-cdc2as Δ2Δ13ΔCCP cells (1 µM NmPP1 for 2 h 45 min at 32 °C) that
were released in 12 mM hydroxyurea, and proteins were extracted 2 h after
release. Phosphorylation of Cds1 upon checkpoint activation results in a
mobility shift (Methods). c, Western blot for the hydroxyurea-treated cells in
a probed for Cds1. Despite the rapid activation of the S-phase checkpoint, S and
M phases occurred simultaneously (a and Supplementary Fig. 14a). Controls
are as in b. d-f, Ectopic activation of the checkpoint using hydroxyurea
prevents mitosis when the onsets of S and M are separated (modified G1 reset
protocol, Supplementary Fig. 15b). d, Percentage of aberrant mitotic nuclei


Furthermore, ectopic activation of the checkpoint using hydroxyurea 
only prevented mitosis when entry into M phase was separated from the 
onset of S phase by temporary incubation in 1 µM inhibitor before
release (Fig. 4c-f and Supplementary Fig. 15d). This demonstrates that 
the S-phase checkpoint can only have a role when the basic sequence of 
cell cycle events is pre-established by proper kinetics of CDK oscillation. 

We propose that the different phases of the mitotic cycle, defined by 
specific CDK activity thresholds, can operate independently of each 
other, establishing that the core cell cycle lacks inherent directionality. 
The oscillation of a single CDK activity can form the basic engine that 
provides directionality to this circuit and imposes the temporal order 
of S and M (Fig. 4g). 
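
The two-threshold picture can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the authors' model: the threshold values and the activity traces are arbitrary placeholders (the real kinetics of CDK activity accumulation are unknown), and all names are hypothetical.

```python
from typing import List

T_S = 0.2  # hypothetical S-phase threshold (arbitrary units)
T_M = 0.8  # hypothetical M-phase threshold (arbitrary units)


def events_for_trace(activity: List[float]) -> List[str]:
    """Return the events triggered as activity crosses each threshold upward.

    Crossing T_S from below triggers "S"; crossing T_M from below triggers "M".
    If a single step crosses both thresholds, both events fire at that step.
    The trace is assumed to start from zero activity.
    """
    events = []
    previous = 0.0
    for current in activity:
        fired = []
        if previous < T_S <= current:
            fired.append("S")
        if previous < T_M <= current:
            fired.append("M")
        if fired:
            events.append("+".join(fired))
        previous = current
    return events


# Gradual rise, then collapse at mitotic exit: S and M occur in order.
normal_cycle = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 0.0, 0.1, 0.3]
# Abrupt jump from low to high activity: S and M are triggered together.
abrupt_jump = [0.0, 0.9]

print(events_for_trace(normal_cycle))  # ['S', 'M', 'S']
print(events_for_trace(abrupt_jump))   # ['S+M']
```

In this sketch the thresholds carry no ordering of their own; the order of S and M comes only from the shape of the activity trace, which is the point the text makes about the oscillator imposing directionality.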


CDK regulatory loops and size control 

The major systems directly regulating CDK activity in fission yeast 
involve the CDK inhibitor Rum1****** and control of Cdc2 phosphorylation
by Wee1, Mik1 and Cdc25*””"’. We asked if these mechanisms
are part of the core regulation or if they are only relevant in normal 
cells with a more complex CDK machinery. 

Neither deregulation of Rum1 through deletion of cig1 and puc1
nor deletion of rum1 in cdc13-L-cdc2 Δ2Δ13ΔCCP cells had any effect
on vegetative growth and checkpoint responses, but Δrum1 cells did
not arrest in G1 after nitrogen starvation, consistent with previous 
observations” (Table 1 and Supplementary Figs 16 and 18). 

Cdc2 phosphorylation has central roles in the control of cell size at 
division®’, the S-phase and DNA-damage checkpoints***, and the 
response to changes in nutrient availability**””. To determine the 
importance of this regulation in the minimal cell cycle, Thr 14 (which 
can be phosphorylated by Wee1**) and Tyr 15 (ref. 41) were altered 
(cdc13-L-cdc2AF cassette producing a Cdc13-L-Cdc2(T14A, Y15F)
fusion protein; Supplementary Fig. 17). Whereas cells operating with
a Cdc2(Y15F) protein showed poor viability”, cdc13-L-cdc2AF
Δ2Δ13ΔCCP cells were surprisingly healthy, although with a longer
(n = 200). Similar results were obtained using a G1 arrest-derived protocol
(data not shown). e, DAPI/Blankophor staining. Scale bar, 10 µm. f, Western
blots probed for Cds1, phospho-Cdc2(Y15), Cdc13 and tubulin. Controls are as
in b. Cells incubated for 40 min with 1 µM inhibitor entered S phase before
release (Supplementary Fig. 15c), resulting in checkpoint activation and
inhibition of mitosis. Note that proper mitotic exit is set up in cells undergoing
simultaneous S and M as shown by the degradation of the fusion protein
(10 µM, T = 30 min). g, The core cell cycle solely relies on changes in
qualitatively the same CDK activity and lacks global timing and directionality
(left panel). A temporal sequence is imposed on the critical independent cell
cycle events by the characteristic oscillation of CDK activity between two
thresholds (right panel: TS, S-phase threshold; TM, M-phase threshold). The
precise kinetics of CDK activity accumulation are unknown (dashed line).
generation time (Fig. 5a and Table 1). Confirming these results, cdc13-
L-cdc2 Δ2Δ13ΔCCP Δwee1 Δmik1 cells had similar characteristics
(Fig. 5a, Table 1 and Supplementary Fig. 17), despite the co-lethality
of wee1 and mik1 in a normal background”. As expected, the S-phase
and DNA-damage checkpoints, which are operative in cdc13-L-cdc2
Δ2Δ13ΔCCP cells, were impaired in the AF mutant (Supplementary
Fig. 18). Although an imbalance in this regulatory loop affects the
functioning of the wild-type fusion protein (Supplementary Fig. 19),
these data show that simplifying the cell cycle network relieves cells
from tight regulation by Wee1, Mik1 and Cdc25.

In contrast to wee1 mutants, which divide at 50% of wild-type size,
cdc13-L-cdc2AF Δ2Δ13ΔCCP cells divided only 15% smaller than cdc13-
L-cdc2 Δ2Δ13ΔCCP cells (Table 1). Nevertheless, G1 was elongated as in
wee1 cells* (Fig. 5b). cdc13-L-cdc2AF Δ2Δ13ΔCCP cells also showed a
higher variability in size at division (Fig. 5c). However, the majority of
cells (67%) divided at a size within the range of variation observed for the
control strain. Furthermore, despite this increased heterogeneity, G2-
arrested cdc13-L-cdc2AFas Δ2Δ13ΔCCP cells returned to their normal
average size efficiently upon release (Fig. 5d and Supplementary Fig. 20).
These data indicate that a Wee1-independent size control system is
operative in these cells and that Wee1 may integrate additional pathways
that render this control more accurate.


Discussion 

We have shown that a minimal control network based on a single 
monomolecular CDK module can autonomously drive the fission 
yeast cell cycle, indicating that differential expression, degradation 
and subcellular localization of CDK subunits, attainment of specific 
ratios of cyclin to CDK, and cyclin-mediated changes in substrate 
specificity are not essential for cell cycle progression. We demonstrate 
that the CDK oscillator provides timing and directionality to a simple 
circuit of two activity thresholds that define independent cell cycle 
phases, and prevails over the S-phase checkpoint in organizing the 
Figure 5 | Role of Cdc2 T14 and Y15 phosphorylation. a-d, All strains
carried deletions of the endogenous copies of cdc2, cdc13, cig1, cig2 and puc1.
a, b, Labels are indicated in box. a, Blankophor staining of exponentially
growing cells. Scale bar, 10 µm. b, DNA content analysis. The percentages of 1C
cells are indicated. c, Distribution of cell size at division in exponentially
growing cultures presented as a percentage of the median size (n = 150). d, G2-
arrested cells (1 µM NmPP1 for 3 h 30 min at 32 °C) were released and cell size
at division determined at the peak of binucleated cells for the following three 
cycles (n = 50; Supplementary Fig. 20). Box and whisker plot. Async, 
asynchronous cultures. 


mitotic cycle (Fig. 4g). We propose that this minimal architecture 
reveals the core control of the eukaryotic cell cycle. Although cell cycle 
regulation is more elaborate in multicellular eukaryotes, the redundancy 
observed for metazoan CDK subunits''* indicates that our conclusions 
may also be relevant for more complex cells. 

We show also that regulation by Wee1, Mik1 and Cdc25 is dispensable
for the minimal cell cycle. Interestingly, cdc13-L-cdc2AF Δ2Δ13ΔCCP
cells divide at a length close to wild type, indicating that the Pom1
gradient operating through Wee1*”® cannot be the only system that
prevents short cells from dividing. Cell-to-cell variation in size at division
is higher in this strain. This could result from increased heterogeneity in
the timing and degree of CDK activation that may perturb the Wee1-
independent size control. Feedback signalling through CDK phosphorylation
may promote size homogeneity by reducing potential noise in
core CDK expression, stability or activation. In other systems, CDK 
activation shows Wee1/Cdc25-dependent stepwise and hysteretic prop- 
erties’ °, which provide sharp transitions and directionality to the cell 
cycle. In cdc13-L-cdc2AF Δ2Δ13ΔCCP cells, CDK activity may therefore
rise more progressively, altering the kinetics of substrate phosphorylation 
and resulting in the observed phenotypes. 

It is unclear how a single CDK activity can sequentially trigger DNA
replication and mitosis, as the possible overlap between S and M estab- 
lishes the simultaneous presence of G1/S and G2/M substrates. The 
CDK may have higher affinity for G1/S substrates. Coupled with 
periodic cyclin degradation, this would allow temporal separation of 
S and M. Alternatively, activity-dependent changes in subcellular 
localization of the whole CDK machinery may provide substrate spe- 
cificity. It is also possible that specific phosphatases target G2/M sub- 
strates more readily. In G1, the significant CDK activity differential— 
from close to zero to a low level—coupled to a lower phosphorylation 
turnover would allow accumulation only of phosphorylated G1/S 
substrates. G2/M substrate-specific phosphatases would establish a 
futile cycle with Cdc2, allowing the more modest differential in CDK 
activity at the end of G2 to produce a significant increase in net 
phosphorylation of G2/M substrates. 

The results presented here may have evolutionary implications. A 
single oscillating CDK module could be the way primitive eukaryotes 
regulated their cell cycle. Subsequent selection would have introduced 
other regulatory layers to improve and fine-tune the core system. Cells 
may have become dependent on these additional elements, rendering 
them essential in modern cells and making it more difficult to fully 
appreciate the core processes involved. 


METHODS SUMMARY 


Standard methods for molecular biology, genetics and microscopy are 
detailed in Methods. Strains are listed in Supplementary Table 1. 
Experiments were carried out in supplemented minimal medium at 
32 °C, except where otherwise noted, with various concentrations of 
NmPP1 inhibitor (TRC). Cell size measurements were made from 
images of blankophor-stained cells. DNA was visualized in heat-fixed 
cells using 4',6-diamidino-2-phenylindole (DAPI) and in live cells 
using Hoechst. Western analyses were performed using total protein 
extracts normalized by amounts of proteins except where otherwise 
noted. DNA content was analysed using a BD FACSCalibur. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Strains and growth conditions. Standard media and methods were used*’*’. 
Strains used in this study are listed in Supplementary Table 1. All experiments 
were carried out in minimal medium plus supplements (EMM4S) at 32 °C except 
where otherwise noted. The CDK module is a fusion of cdc13 and cdc2 open 
reading frames without introns. The fusion cassettes were cloned in a vector 
adjacent to the ura4+ cassette; flanking regions allowed a restriction fragment
to replace the leu1 gene by homologous recombination. The NmPP1 inhibitor 
(A603003; TRC) was dissolved in DMSO at a stock concentration of 10 mM and 
added to the cultures at the indicated concentrations. The first 67 amino-terminal 
residues of Cdc13 were deleted in the Cdc13ΔDB-L-Cdc2as protein®’. Expression
of the cdc13ΔDB-L-cdc2as cassette was induced by addition of 250 mg l⁻¹ uracil to
the medium. For hydroxyurea treatments in Fig. 4 and Supplementary Figs 14
and 15, hydroxyurea was added 10 min before release from the inhibitor. The
alteration in cdc13-117ts cells (Cdc13(C379Y)) was determined by sequencing.
The cig1Δ::ura4+, cig2Δ::ura4+, puc1Δ::ura4+, mik1Δ::leu2+ deletions and the
cdc2-33ts, wee1-50ts and cdc25-22ts alleles have been previously described'*7****.
The rad3Δ::ura+ deletion was a gift from R. Daga. The cdc2Δ::kanMX6,
cdc13Δ::natMX6, rum1Δ::hphMX6 and cig2Δ::natMX6 deletions were exact
replacements of the open reading frames as described”.
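
As a worked example of the dilution arithmetic, the volume of 10 mM NmPP1 stock needed for a given final concentration follows from C1·V1 = C2·V2. This is a minimal sketch: the helper name and the 50 ml culture volume are illustrative assumptions, and the small volume added by the stock itself is ignored.

```python
STOCK_MM = 10.0  # stock concentration in mM, as stated in Methods


def stock_volume_ul(final_um: float, culture_ml: float) -> float:
    """Microlitres of stock needed for the culture to reach final_um (µM)."""
    if final_um < 0 or culture_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    stock_um = STOCK_MM * 1000.0      # 10 mM = 10,000 µM
    culture_ul = culture_ml * 1000.0  # ml -> µl
    return final_um * culture_ul / stock_um


# Concentrations (µM) that appear in the text; 50 ml is a hypothetical culture volume.
for conc in (0.3, 1.0, 7.5, 10.0):
    print(f"{conc} uM -> {stock_volume_ul(conc, 50.0):.2f} ul of stock per 50 ml")
```

For a hypothetical 50 ml culture, 1 µM corresponds to 5 µl of stock, so the lowest concentrations used here would in practice call for an intermediate dilution to pipette accurately.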

Cell size measurement and DNA staining. For size measurement, live cells were 
stained with Blankophor (MP Biochemicals). For DNA staining, cells were either 
heat-fixed on microscope slides and stained with DAPI (with 1:4 Blankophor
where indicated) or stained live with 50 µg ml⁻¹ Hoechst DNA stain. Images were
acquired in Metamorph (MDS Analytical Technologies) using an Axioplan 2 
(Carl Zeiss) epifluorescence microscope and a CoolSNAP HQ camera (Roper 
Scientific). Cell size was determined in ImageJ (National Institutes of Health) 
using the Pointpicker plug-in. 

Protein extracts and western blots. Western blots were performed on total 
protein extracts. Protein extracts in Fig. 1 and Supplementary Figs 6, 16 and 17 
were prepared using NaOH extraction®'. In all other cases, cells were frozen in 
liquid nitrogen, broken with glass beads in the presence of protease and phos- 
phatase inhibitors (Roche) and resuspended in SDS buffer. Samples were normalized 
by amounts of proteins except where otherwise noted. Antibodies used: Cdc13
polyclonal (SP4 (ref. 20); 1:2,500), Cds1 polyclonal (1:3,000, a gift from N. 
Rhind”), phospho-Cdc2(Y15) polyclonal (Cell Signaling; 1:300) and tubulin mono- 
clonal (1:10,000, a gift from K. Gull®). Activation of the S-phase checkpoint was 
monitored by the phosphorylation-dependent shift in Cds1 mobility by 8% SDS- 
polyacrylamide gel electrophoresis. 

Flow cytometry profile interpretation. DNA content analysis was performed by 
flow cytometry using ethanol-fixed and propidium-iodide-stained cells (2 µg
ml⁻¹ propidium iodide in 50 mM sodium citrate) and a BD FACSCalibur. The
fission yeast cell cycle has a very short G1, and cells undergo DNA replication 
before cytokinesis. As a result, fission yeast cells spend most of their cell cycle with 
a 2C DNA content. In synchronized cultures, a transient 4C peak appears as S 
phase occurs in post-mitotic binucleated cells. This is resolved upon cytokinesis, 
producing mononucleated 2C cells. This phase only represents a small fraction of 
the population in an asynchronous culture; a larger 4C peak corresponds either to 
binucleated cells that have completed S phase but show a cytokinesis defect or to 
mononucleated G2 cells that have undergone an additional round of DNA rep- 
lication without intervening mitosis. The appearance of a 1C peak reflects an 
elongation of G1 resulting in cytokinesis taking place before completion of DNA 
replication. Intermediate, non-discrete profiles occur when cells divide although 
mitosis is not complete, resulting in the septum cutting through the DNA mass 
and subsequent aberrant distribution of the DNA; this is referred to as ‘cut’ cells®. 
Finally, cell size has an effect on the position of the flow cytometry profiles as non- 
nuclear staining increases with size: profiles are shifted to the right in long cells 
and to the left in newly divided cells, despite identical DNA contents”. 

Nuclear YFP quantification. Cells were imaged on agar pads under a coverslip
and Z stacks were acquired using a DeltaVision RT microscope (Applied
Precision). Quantification was performed on maximum projections using
ImageJ (National Institutes of Health) as follows. Fluorescence intensity of
equivalent areas within the nucleus (N), the cytoplasm (C) and outside (B) of each
cdc13-L-cdc2-YFP Δ2Δ13ΔCCP cell was measured (strongly stained structures
such as the spindle or spindle pole body were excluded), as was cell size. Similar
measurements were performed in cdc13-L-cdc2 Δ2Δ13ΔCCP cells as a control. C
was not significantly different from that in the control cells and was therefore
used as a normalization value. Using the cdc13-L-cdc2 Δ2Δ13ΔCCP cells, we
estimated the average auto-fluorescence (A) in the nucleus as a constant
percentage of B. The value reflecting nuclear fluorescence for each cell was
calculated as [N − B − (A × B)]/C. For binucleated and septated cells, N was
calculated as the average fluorescence of both nuclei. Values were then sorted by
cell size. The rare cells showing a negative value were considered as negative for
the YFP signal.
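
The normalization described above can be sketched in a few lines of Python. This is only an illustration of the formula [N − B − (A × B)]/C: the function names, the dictionary layout and the example numbers are hypothetical and are not taken from the original analysis, which was carried out in ImageJ.

```python
def nuclear_yfp(n, c, b, a):
    """Normalized nuclear fluorescence, [N - B - (A x B)] / C.

    n: nuclear intensity, c: cytoplasmic intensity, b: background outside the
    cell, a: nuclear auto-fluorescence expressed as a fraction of b.
    """
    return (n - b - a * b) / c


def quantify(cells, a):
    """Return (size, value, yfp_positive) tuples sorted by cell size.

    Each cell is a dict with hypothetical keys 'size', 'nuclei' (one intensity
    per nucleus), 'c' and 'b'. For binucleated or septated cells, N is the
    average of both nuclei, as described in the text.
    """
    rows = []
    for cell in cells:
        n = sum(cell["nuclei"]) / len(cell["nuclei"])
        value = nuclear_yfp(n, cell["c"], cell["b"], a)
        # Negative values are treated as "no YFP signal" rather than discarded.
        rows.append((cell["size"], value, value > 0))
    return sorted(rows, key=lambda row: row[0])


# Made-up intensities, for illustration only.
example_cells = [
    {"size": 14.0, "nuclei": [150.0, 170.0], "c": 50.0, "b": 100.0},
    {"size": 9.0, "nuclei": [105.0], "c": 50.0, "b": 100.0},
]
table = quantify(example_cells, a=0.1)
```

Keeping the negative values in the output, with a flag, preserves the size ordering while making it explicit which cells are scored as YFP-negative.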

Total YFP quantification by flow cytometry. The fluorescence (FL1-H) of
cdc13-L-cdc2-YFP Δ2Δ13ΔCCP and cdc13-L-cdc2 Δ2Δ13ΔCCP cells in minimal
medium at 25 °C was measured as a function of size (FSC-H) by flow cytometry
using a BD FACSCalibur. YFP measurements were averaged in size bins of 250
cells. The equation of the linear regression obtained in the control strain was used
to subtract the auto-fluorescence background from the cdc13-L-cdc2-YFP
Δ2Δ13ΔCCP measurements.
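
A minimal sketch of this binning and background subtraction follows, assuming events are available as (FSC-H, FL1-H) pairs. The function names are hypothetical, the least-squares fit is written out by hand to keep the example dependency-free, and dropping an incomplete final bin is an assumption rather than something stated in the text.

```python
def bin_means(events, bin_size=250):
    """Sort (size, fluorescence) events by size and average consecutive bins.

    An incomplete final bin is dropped (an assumption of this sketch).
    """
    ordered = sorted(events)
    bins = []
    for start in range(0, len(ordered) - bin_size + 1, bin_size):
        chunk = ordered[start:start + bin_size]
        mean_size = sum(size for size, _ in chunk) / bin_size
        mean_fluo = sum(fluo for _, fluo in chunk) / bin_size
        bins.append((mean_size, mean_fluo))
    return bins


def linear_fit(points):
    """Ordinary least-squares line through (x, y) points: (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def subtract_background(sample_bins, slope, intercept):
    """Subtract the control-strain regression line from each binned value."""
    return [(size, fluo - (slope * size + intercept)) for size, fluo in sample_bins]


# Synthetic events: the control carries only auto-fluorescence, the YFP strain
# carries the same background plus a size-dependent signal.
control = [(x, 0.1 * x + 5.0) for x in range(1000)]
sample = [(x, 0.1 * x + 5.0 + 0.02 * x) for x in range(1000)]
slope, intercept = linear_fit(bin_means(control))
corrected = subtract_background(bin_means(sample), slope, intercept)
```

Fitting the control strain on binned rather than raw events mirrors the order of operations in the text; fitting raw events would be an equally plausible reading.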

Percentage of dead cells in liquid cultures. Liquid cultures were exponentially 
grown for 36 h at the appropriate temperatures in the presence of 25 mg l⁻¹ of the
dye phloxin B (phloxin B stains dead cells in pink). The percentage of dead cells 
was estimated by microscopy. 

Nitrogen starvation. Cells were exponentially grown at 28 °C in minimal medium 
with supplements, washed with water and inoculated at approximately 1.5 x 10° 
cells ml⁻¹ into non-supplemented minimal medium without nitrogen at 28 °C.

EdU incorporation and detection. cdc13-L-cdc2as Δ2Δ13ΔCCP cells expressing
the human ENT1 transporter and herpes simplex virus thymidine kinase” were 
incubated with 2 µM EdU (Invitrogen) for 5 min and fixed in 3.7% formaldehyde.
After permeabilization (A. Kaykov, manuscript in preparation), EdU detection 
was performed according to manufacturer’s instructions (Invitrogen, Click-iT 
EdU Alexa Fluor 594 Imaging Kit) and imaged in Metamorph (MDS Analytical 
Technologies) using an Axioplan 2 (Carl Zeiss) epifluorescence microscope and a 
CoolSNAP HQ camera (Roper Scientific). 
Images of a fourth planet orbiting HR 8799


Christian Marois¹, B. Zuckerman², Quinn M. Konopacky³, Bruce Macintosh³ & Travis Barman⁴


High-contrast near-infrared imaging of the nearby star HR 8799 
has shown three giant planets’. Such images were possible because 
of the wide orbits (>25 astronomical units, where 1 AU is the Earth- 
Sun distance) and youth (<100 Myr) of the imaged planets, which 
are still hot and bright as they radiate away gravitational energy 
acquired during their formation. An important area of contention 
in the exoplanet community is whether outer planets (>10 AU) more 
massive than Jupiter form by way of one-step gravitational instabil- 
ities” or, rather, through a two-step process involving accretion of a 
core followed by accumulation of a massive outer envelope com- 
posed primarily of hydrogen and helium’. Here we report the pres- 
ence of a fourth planet, interior to and of about the same mass as the 
other three. The system, with this additional planet, represents a 
challenge for current planet formation models as none of them can 
explain the in situ formation of all four planets. With its four young 
giant planets and known cold/warm debris belts*, the HR 8799 
planetary system is a unique laboratory in which to study the forma- 
tion and evolution of giant planets at wide (>10 AU) separations. 
New near-infrared observations of HR 8799, optimized for detecting 
close-in planets, were made at the Keck II telescope in 2009 and 2010. 
(See Table 1 for a summary.) A subset of the images is presented in 
Fig. 1. A fourth planet, designated HR 8799e, is detected at six different
epochs at an averaged projected separation of 0.368″ ± 0.003″
(14.5 ± 0.4 AU). Planet e is bound to the star and is orbiting anticlockwise
(see Fig. 2), as are the three other known planets in the system. The
measured orbital motion, 46 ± 10 mas yr⁻¹, is consistent with a roughly
circular orbit of semimajor axis (a) 14.5 AU with a ~50-year period.
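
As a rough consistency check on these numbers, the short Python sketch below converts the angular separation to a projected distance and applies Kepler's third law for a face-on circular orbit. The 39.4 pc distance is the value quoted in the Fig. 1 caption; the stellar mass of 1.5 solar masses is an assumption made here for illustration and is not given in this text.

```python
import math

DISTANCE_PC = 39.4          # distance to HR 8799 (Fig. 1 caption)
SEPARATION_ARCSEC = 0.368   # averaged projected separation of planet e
STELLAR_MASS_MSUN = 1.5     # ASSUMPTION: stellar mass is not stated in the text


def projected_separation_au(sep_arcsec, distance_pc):
    """Small-angle relation: 1 arcsec at 1 pc corresponds to 1 AU."""
    return sep_arcsec * distance_pc


def kepler_period_yr(a_au, mass_msun):
    """Kepler's third law in solar units: P^2 = a^3 / M."""
    return math.sqrt(a_au ** 3 / mass_msun)


def circular_motion_mas_per_yr(sep_arcsec, period_yr):
    """Sky-plane speed on a face-on circular orbit, in mas per year."""
    return 2.0 * math.pi * sep_arcsec / period_yr * 1000.0


a_au = projected_separation_au(SEPARATION_ARCSEC, DISTANCE_PC)
period_yr = kepler_period_yr(a_au, STELLAR_MASS_MSUN)
motion_mas_yr = circular_motion_mas_per_yr(SEPARATION_ARCSEC, period_yr)
```

Under these assumptions the period comes out at roughly 45 years and the expected motion at about 51 mas per year, within the quoted 46 ± 10 mas per year.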
Knowledge of the age and luminosity of the planets is critical for 
deriving their fundamental properties, including mass. In 2008 we 
used various techniques to estimate an age of 60 Myr with a plausible 


Table 1 | HR 8799e astrometry, photometry and physical characteristics

Epoch, band, wavelength                         Separation [E, N] from the host star
2009 Jul. 31, Kp band 2.124 µm (±0.019″)        [−0.299″, −0.217″]
2009 Aug. 1, L′ band 3.776 µm (±0.013″)         [−0.303″, −0.209″]
2009 Nov. 1, L′ band 3.776 µm (±0.010″)         [−0.304″, −0.196″]
2010 Jul. 13, Ks band 2.146 µm (±0.008″)        [−0.325″, −0.173″]
2010 Jul. 21, L′ band 3.776 µm (±0.011″)        [−0.324″, −0.175″]
2010 Oct. 30, L′ band 3.776 µm (±0.010″)        [−0.334″, −0.162″]

Parameter                                               Value
Projected separation, avg. from all epochs* (AU)        14.5 ± 0.4
Orbital motion (arcsec yr⁻¹)                            0.046 ± 0.010
Period for a face-on circular orbit (yr)                ~50
ΔKs 2.146 µm† (mag)                                     10.67 ± 0.22
ΔL′ 3.776 µm† (mag)                                     9.37 ± 0.12
Absolute magnitude at 2.146 µm, M_Ks (mag)              12.93 ± 0.22
Absolute magnitude at 3.776 µm, M_L′ (mag)              11.61 ± 0.12
Luminosity (log L☉)                                     −4.7 ± 0.2
Mass for 30⁺²⁰₋₁₀ Myr (M_Jup)                           7⁺³₋₂
Mass for 60⁺¹⁰⁰₋₃₀ Myr (M_Jup)                          10 ± 3

* The projected separation error (in AU) also accounts for the uncertainty in the distance to the star.
† Planet-to-star flux ratios, expressed as difference of magnitude. No reliable photometry was derived
for the Kp-band 2009 Jul. 31 data.


range between 30 and 160 Myr (here we represent this as 60⁺¹⁰⁰₋₃₀ Myr),
consistent with an earlier estimate of 20-150 Myr (ref. 5). Two recent
analyses (R. Doyon et al., and B. Zuckerman et al., manuscripts in
preparation) independently deduce that HR 8799 is very likely to be
a member of the 30 Myr Columba association. This conclusion is
based on common Galactic space motions and age indicators for stars
located between the previously known Columba members and HR
8799. The younger age suggests smaller planet masses, but to be conservative
we use both age ranges (30⁺²⁰₋₁₀ Myr (Columba association)
and 60⁺¹⁰⁰₋₃₀ Myr) to derive the physical properties of planet e.




Figure 1 | HR 8799e discovery images. Images of HR 8799 (a star at
39.4 ± 1.0 pc and located in the Pegasus constellation) were acquired at the
Keck II telescope with the Angular Differential Imaging technique (ADI) to
allow a stable quasi-static point spread function (PSF) while leaving the field of
view to rotate with time while tracking the star in the sky. The ADI/LOCI
SOSIE software was used to subtract the stellar flux, and to combine and
flux-calibrate the images. Our SOSIE software iteratively fits the planet PSF to
derive relative astrometry and photometry (the star position and its
photometry were obtained from unsaturated data or from its PSF core that was
detectable through a flux-calibrated focal plane mask). a, An L′-band image
acquired on 21 July 2010; b, a Ks-band image acquired on 13 July 2010 (arrows
in a and b point towards planet e); c, an L′-band image acquired on 1 November
2009. All three sequences were ~1 h long. No coronagraphic focal plane mask
was used on 1 November 2009, but a 400-mas-diameter mask was used on 13
July and 21 July 2010. HR 8799e is located southwest of the star. Planets b, c and
d are seen at respective projected separations of 68, 38 and 24 AU from the
central star, consistent with roughly circular orbits at inclinations of <40° (refs
11-13). Their masses (7, 10 and 10 M_Jup for b, c and d for 60 Myr age; 5, 7 and
7 M_Jup for 30 Myr age) were estimated from their luminosities using
age-dependent evolutionary models. North is up and east is left.
Figure 2 | HR 8799e 2009-10 astrometry. Main figure, the 2009-10 orbital
motions of the four planets b, c, d and e. Crosses denote the positions for 2009 and
2010 first and last epochs for b, c and d, and for all six epochs for e. A square is
drawn over the cross symbol of each planet's first epoch. Inset, a zoomed
version of planet e's astrometry, including the expected motion (curved solid
line) if it is an unrelated background object; each epoch is labelled by a number
1-6; a dashed line connects the star to each epoch data point; error bars, ±1 s.d.
Planet e is confirmed as bound to HR 8799, and it is moving at
46 ± 10 mas yr⁻¹ anticlockwise. In the main figure, the orbits of the giant
planets of our Solar System (Jupiter, Saturn, Uranus and Neptune) are drawn to
scale (light grey circles). With a period of ~50 years, the orbit of HR 8799e will
be rapidly constrained by future observations; at our current measurement
accuracy, it will be possible to measure orbital curvature after only 2 years.

HR 8799e is located very near planets c and d in a Ks versus Ks − L′
colour-magnitude diagram, suggesting that all three planets have similar
spectral shapes and bolometric luminosities. We therefore adopt
the same luminosity for these three planets; however, given the larger
photometric error bars and sparse wavelength coverage associated
with planet e, we have conservatively assigned to it a luminosity error
(Table 1) twice as large as those for planets b, c and d. This luminosity
estimate is consistent with empirically calibrated bolometric corrections
for brown dwarfs, although such corrections may be ill-suited
for young planets with distinct spectra and colours. Using the two
overlapping age ranges outlined above and the evolutionary models
described in the HR 8799bcd discovery article, we estimate the mass of
planet e to be 7⁺³₋₂ M_Jup (30 Myr) and 10 ± 3 M_Jup (60 Myr), where M_Jup
is the mass of Jupiter; see Fig. 3. The broadband photometry of planets
b, c and d provides strong evidence for significant atmospheric cloud
coverage, while recent spectroscopy of planets b and c shows evidence
for non-equilibrium CO/CH₄ chemistry. Given the limited wavelength
coverage of the discovery images for planet e, it is too early to


Figure 3 | The mass of HR 8799e from the age-luminosity relationship.
Solid lines are luminosity-versus-age tracks for planet evolution models
(luminosities are normalized to the solar luminosity, L☉). Objects above 13
M_Jup are typically considered to be outside the planet-mass regime; however,
the tail end of the planet distribution found by radial velocity surveys extends
above this IAU-defined mass limit. Boxed areas show adopted luminosity
ranges (±1 s.d.) and estimated age ranges for the four HR 8799 planets:
cross-hatched boxes show age range 30⁺²⁰₋₁₀ Myr; grey boxes show age range
60⁺¹⁰⁰₋₃₀ Myr; planets c, d and e have similar luminosities, but the luminosity
uncertainty for e is larger and indicated by the darker box/opposite hatch. For
comparison, the ages and luminosities of four recently imaged planet-mass
companions near other stars are indicated (numbered 1-4; see key on figure,
showing 1 s.d. error bars for the luminosity and estimated age ranges). An
asteroseismology study suggested that the HR 8799 system might be as old as
~1 Gyr (ref. 27), but it is highly unlikely that such an old star would have very
massive debris belts; such an age would also require planetary masses far too
high for long-term stability. The older age also requires an inclination of the
stellar pole relative to the line of sight of ~50°, inconsistent with the nearly
face-on planetary system and the ~25° inclination upper limit measured from
Spitzer images of the outer dust halo⁴. Mass estimates based on any existing
evolutionary model at ages as young as 20-30 Myr suffer from unconstrained
initial formation conditions; the masses presented here could be
underestimated if the planets formed by core accretion, though 'cold start'
core-accretion models do not reproduce the observed luminosity for any
combination of mass and age. While this additional uncertainty can lead
temporarily to ambiguity about the planets' masses and formation history
(core accretion or gravitational instability), it does highlight the importance of
discovering and following in orbit planet-mass companions at ages when
formation processes are important.
Figure 4 | Comparison of HR 8799 and our Solar System. Top, Solar System;
bottom, the HR 8799 system. HR 8799 infrared data indicate the existence of an
asteroid belt analogue located at 6-15 AU (we have moved the estimated outer
edge of this belt to 10 AU because of planet e's estimated chaotic region), an
Edgeworth-Kuiper-belt-like debris disk at >90 AU and a small particle halo
extending up to 1,000 AU (ref. 4). The red shaded regions represent the locations
of the inner and outer debris belts in both systems (the Solar System Oort comet
cloud and the HR 8799 halo are not shown). The horizontal axis of the HR 8799
plot is compressed by the square root of the ratio of the luminosity of HR 8799
(4.92 ± 0.41 L☉) to that of the Sun to show both systems over the same
equilibrium temperature range. Given the current apparent separations of the
four planets of HR 8799 and the preferred locations of the inner warm debris
disk and the inner edge of the outer cold disk (90 AU), (1) the indicated 4:1
and 2:1 period resonances between the inner/outer edge of the warm debris belt
and planet e, and (2) a 3:2 mean motion resonance of b with the inner edge of
the outer cold disk, are both consistent with the observations. By analogy, the
inner and outer edges of the main asteroid belt of our Solar System are,
respectively, in 4:1 and 2:1 mean motion resonances with Jupiter. Many
members of the Edgeworth-Kuiper belt, including Pluto, are in a 3:2 mean
motion orbital resonance with Neptune. Solar System planet images are from
NASA; HR 8799 artwork is from Gemini Observatory and L. Cook. Planet
diameters are not to scale.

say much about the atmospheric properties of this particular planet;
however, given that its near-infrared colour is similar to those of the
other three planets, we can anticipate similar cloud structure and
chemistry for planet e.

Stability analyses have shown that the original three-planet system
may be in a mean motion period resonance with an upper limit on
planetary masses of ~20 M_Jup assuming an age of up to 100 Myr. With
the discovery of a fourth planet, we revisit the stability of this system. We
searched for stable orbital configurations with the HYBRID/Mercury
package using the 30-Myr (5, 7, 7 and 7 M_Jup for b, c, d and e
respectively) and 60-Myr (7, 10, 10 and 10 M_Jup) masses. In our preliminary
search, we held the parameters for b, c and d fixed to those matching
either the single resonance (1:2 resonance between planets c and d only)
or double resonance (1:2:4 resonance between planets b, c and d) stable
solutions found to date, but allowed the parameters for e to vary within
the regime allowed by our observations. On the basis of the single-resonance
configuration and using the 30-Myr masses, in 100,000 trials
seven solutions for e were found that are stable for at least 160 Myr (the
maximum estimated age of the system), and an additional five solutions
were found that are stable for over 100 Myr. All maximally stable solutions
have a semimajor axis of ~14.5 AU, with planets c, d and e in a 1:2:4
resonance (planet b not in resonance). A set of 100,000 trials was also
performed using the 60-Myr masses, but only two solutions were found
that are stable for over 100 Myr, each of which requires a semimajor axis
of ~12.5 AU, 4σ away from our astrometry. This suggests that a
younger system age and lower planet masses are preferred, although a
much more thorough search of parameter space is required (see the
Supplementary Information for tables of stable solutions).

The mechanism for the formation of this system is unclear. It is
challenging for gravitational-instability fragmentation to occur at
a < 20-40 AU (refs 15,16), ruling that mechanism out for in situ


formation of planet e. In addition, disk instability mechanisms
preferentially form objects more massive than these planets. If the HR
8799 system represented low-mass examples of such a population,
brown-dwarf companions to young massive stars would be plentiful.
Nearby young star surveys and our nearly complete survey of 80
stars with similar masses and ages to HR 8799 have discovered no such
population of brown dwarf companions. HR 8799e and possibly d are
close enough to the primary star to have formed by bottom-up accretion
in situ, but planets b and c are located where the collisional
timescale is conventionally thought to be too long for core accretion
to form giant planets before the system’s gas is depleted. A hybrid 
process with different planets forming through different mechanisms 
cannot be ruled out, but seems unlikely with the similar masses and 
dynamical properties of the four planets. It is possible that one mech- 
anism dominated the other and the planets later migrated to their 
current positions. The HR 8799 debris disk is especially massive for 
a star of its age (or for any older main sequence star*’), which could 
indicate an extremely dense protoplanetary disk. Such a disk could 
have induced significant migration, moving planets formed by disk- 
instability inward, or the disk could have damped the residual eccent- 
ricity from multi-planet gravitational interactions that moved core- 
accretion planets outward. The massive debris disk and the lack of 
higher-mass analogues to this system do suggest that HR 8799 repre- 
sents the high-mass end of planet formation. 

The HR 8799 system does show interesting similarities with our 
Solar System; all giant planets are located past the estimated snow line 
of each system (~2.7 AU for the Solar System, ~6 AU for HR 8799), and
the debris belts of each system are located at similar equilibrium tem- 
peratures (Fig. 4). With its four massive planets, massive debris belts 
and large scale, the HR 8799 planetary system is an amazing example 
of extreme systems that can form around stars. 
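
The resonance locations quoted in the Fig. 4 caption can be checked with Kepler's third law, since a body whose period is a fixed ratio of a planet's period orbits at a semimajor axis scaled by that ratio to the power 2/3. The sketch below is illustrative only: the function names are hypothetical, the planets' projected separations (14.5 AU for e, 68 AU for b) are used as stand-ins for their semimajor axes, and the Jupiter and Neptune values are standard figures that do not come from this text.

```python
import math


def resonance_location_au(a_planet_au, period_ratio):
    """Semimajor axis of a body whose period is period_ratio x the planet's.

    Kepler's third law around a common star gives a proportional to P^(2/3).
    """
    return a_planet_au * period_ratio ** (2.0 / 3.0)


def solar_equivalent_au(distance_au, luminosity_lsun):
    """Distance around the Sun with the same equilibrium temperature.

    Equilibrium temperature scales as L^(1/4) / d^(1/2), so distances are
    compressed by sqrt(L), as described for the Fig. 4 axis.
    """
    return distance_au / math.sqrt(luminosity_lsun)


# HR 8799: interior 4:1 and 2:1 resonances with planet e, exterior 3:2 with b.
inner_edge = resonance_location_au(14.5, 1.0 / 4.0)
outer_edge = resonance_location_au(14.5, 1.0 / 2.0)
cold_disk_edge = resonance_location_au(68.0, 3.0 / 2.0)

# Solar System analogues (Jupiter at 5.20 AU, Neptune at 30.07 AU).
asteroid_inner = resonance_location_au(5.20, 1.0 / 4.0)
asteroid_outer = resonance_location_au(5.20, 1.0 / 2.0)
plutino_orbit = resonance_location_au(30.07, 3.0 / 2.0)

# The 90 AU inner edge of the cold disk, rescaled to solar luminosity.
cold_disk_solar_equivalent = solar_equivalent_au(90.0, 4.92)
```

With these inputs the warm belt edges fall near 6 and 9 AU and the cold disk edge near 89 AU, in line with the 6-10 AU belt and the 90 AU inner edge described in the caption.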
Spin-orbit qubit in a semiconductor nanowire


S. Nadj-Perge¹*, S. M. Frolov¹*, E. P. A. M. Bakkers¹,² & L. P. Kouwenhoven¹


Motion of electrons can influence their spins through a fun- 
damental effect called spin-orbit interaction. This interaction pro- 
vides a way to control spins electrically and thus lies at the 
foundation of spintronics'. Even at the level of single electrons, 
the spin-orbit interaction has proven promising for coherent spin 
rotations’. Here we implement a spin-orbit quantum bit (qubit) in 
an indium arsenide nanowire, where the spin-orbit interaction is 
so strong that spin and motion can no longer be separated**. In this 
regime, we realize fast qubit rotations and universal single-qubit 
control using only electric fields; the qubits are hosted in single- 
electron quantum dots that are individually addressable. We 
enhance coherence by dynamically decoupling the qubits from 
the environment. Nanowires offer various advantages for quantum 
computing: they can serve as one-dimensional templates for scalable 
qubit registers, and it is possible to vary the material even during 
wire growth’. Such flexibility can be used to design wires with sup- 
pressed decoherence and to push semiconductor qubit fidelities 
towards error correction levels. Furthermore, electrical dots can 
be integrated with optical dots in p-n junction nanowires*’. The 
coherence times achieved here are sufficient for the conversion of 
an electronic qubit into a photon, which can serve as a flying qubit 
for long-distance quantum communication. 

Figure 1a shows a scanning electron microscope image of our nano- 
wire device. Two electrodes, source and drain, are used to apply a 
voltage bias of 6 mV across the InAs nanowire. Voltages applied to
five closely spaced, narrow gates underneath the nanowire create a 
confinement potential for two electrons separated by a tunnelling 
barrier. The defined structure is known as a double quantum dot in 
the (1, 1) charge configuration’. 

Each of the two electrons represents a spin-orbit qubit (Fig. 1b). In
the presence of strong spin-orbit coupling, neither spin nor orbital
number is separately well defined. Instead, each qubit state is a spin-orbit
doublet, |⇑⟩ and |⇓⟩. Similar to pure spin states, a magnetic field, B,
controls the energy splitting, E_Z = gμ_B B, between spin-orbit states,
where g is the Landé g-factor in a quantum dot and μ_B is the Bohr
magneton. The crucial difference from a spin qubit is that in a spin-orbit
qubit the orbital part of the spin-orbit wavefunction is used for
qubit manipulation.

The qubit read-out and initialization rely on the effect of spin
blockade. A source-drain bias induces a current of electrons passing
one by one through the double dot. The process of electron transfer
between the dots can be allowed energetically but blocked by a spin
selection rule. For instance, a (1, 1) triplet state cannot change into a
(0, 2) singlet state. This stops the left-hand electron from tunnelling
into the right-hand dot, and thereby blocks the current. In practice, the
double dot becomes blocked only in a parallel configuration, that is, in
either a (⇑, ⇑) or a (⇓, ⇓) state, because antiparallel states decay quickly
to a non-blocked singlet state. Idling the qubits in the parameter
range of spin blockade therefore initializes them in one of the two parallel
states with equal probability. We note that spin-orbit and hyperfine
interactions also mediate a slower decay of parallel states into
(0, 2). This reduces the read-out fidelity to 70-80% (Supplementary
Information, section 5.1).
Figure 1 | Electric-dipole spin resonance. a, Scanning electron microscope
image of a prototype device showing source (S) and drain (D) contacts, narrow
gates one to five and wide gates B1 and B2. b, Left-hand (red) and right-hand
(green) quantum dots are formed between gates two and five. A microwave
electric field applied to gate four oscillates both electrons with amplitude ~Δr,
inducing EDSR. c, Spin blockade is lifted near B = 0 and on resonance when
f = gμ_B B/h. Here the microwave power is P = −42 dBm. I, current. d, Trace
extracted from c at f = 9 GHz. e, Magnified view of the EDSR line, which is split
at high B values owing to the difference between g_L and g_R. At each B value, the
frequency is swept in a fixed range around f_0 = gμ_B B/h (g = 9.28). The current
at resonance varies owing to non-monotonic microwave transmission.
A microwave-frequency electric field applied to gate four oscillates
electrons inside the nanowire (Fig. 1b). This motion can induce
resonant transitions between spin-orbit states by means of an effect
called electric-dipole spin resonance (EDSR). Such transitions
are expected when the frequency of the a.c. electric field is equal to the
Larmor frequency, f_0 = gμ_B B/h (h, Planck's constant). At resonance,
the spin-orbit state of the double dot rapidly changes from parallel to
antiparallel. The antiparallel state does not experience spin blockade,
so the left-hand electron tunnels to the right, thereby contributing to
the current. Figure 1c shows the resonance as a V shape that maps out
the Larmor frequency in the plane of microwave frequency and magnetic
field.
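
For orientation, the resonance condition f_0 = gμ_B B/h can be evaluated numerically. The sketch below uses CODATA values for the Bohr magneton and Planck's constant and the g = 9.28 quoted in the Fig. 1e caption; the function names are hypothetical.

```python
MU_B = 9.2740100783e-24   # Bohr magneton, J/T (CODATA 2018)
H_PLANCK = 6.62607015e-34  # Planck constant, J s (exact in the SI)


def larmor_frequency_hz(g, b_tesla):
    """Larmor frequency f_0 = |g| * mu_B * B / h."""
    return abs(g) * MU_B * b_tesla / H_PLANCK


def resonance_field_tesla(g, f_hz):
    """Magnetic field at which a drive at f_hz is resonant."""
    return H_PLANCK * f_hz / (abs(g) * MU_B)


# Field used for the Rabi measurements in Fig. 2c (B = 102 mT).
f_at_102_mt = larmor_frequency_hz(9.28, 0.102)

# Field at which the f = 9 GHz trace of Fig. 1d should show EDSR peaks.
b_at_9_ghz = resonance_field_tesla(9.28, 9.0e9)
```

With g = 9.28 the resonance at 102 mT falls close to 13 GHz, matching the drive frequency quoted for the Rabi data, and the two arms of the V in Fig. 1c correspond to the two signs of B.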

The V-shaped resonance signal vanishes in the vicinity of zero 
magnetic field. This behaviour is consistent with spin-orbit mediated 
EDSR: the effect of spin-orbit interaction must cancel at zero field 
owing to time-reversal symmetry’. The field-dependent EDSR 
strength rules out a.c. magnetic field and hyperfine field gradient as 
possible mechanisms. A g-tensor modulation in our nanowires is esti- 
mated to be too weak to drive EDSR (Supplementary Information, 
section 2). The current peak near zero magnetic field arises from the
hyperfine interaction between electron spin and the nuclear spin
bath. From the width of this hyperfine-induced peak, we extract
the root mean squared magnetic field generated by the fluctuating
nuclear spins, B_N = 0.66 ± 0.1 mT (ref. 17). The width of the EDSR
line at low microwave power is also consistent with broadening due to
fluctuating nuclear spins (that is, the side EDSR peaks and the central
hyperfine peak have comparable widths in Fig. 1d).

At higher magnetic field, the resonance line splits (Fig. 1e), indicating
that the g-factors in the left- and right-hand dots, g_L and g_R, are
different. This is expected for quantum dots of different sizes because
confinement changes the effective g-factor. We measured the confinement
as the orbital excitation energy at the (1, 0) ↔ (0, 1) transition
and found 7.5 ± 0.1 meV for the left-hand dot and 9.0 ± 0.2 meV
for the right-hand dot. A smaller orbital excitation energy should
correspond to a larger g-factor in InAs; therefore, we assign the values
obtained from Fig. 1e to the left- and right-hand dots as follows:
|g_L| = 9.2 ± 0.1 and |g_R| = 8.9 ± 0.1. At frequencies above 10 GHz,
the two resonances are more than a linewidth apart, allowing us to
control the left- and right-hand qubits separately.
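
The two g-factors can be cross-checked against the addressing conditions quoted in the Fig. 2f caption, where the same 18.66 GHz drive is resonant at 144 mT (red, the left-hand dot) and 149 mT (green, the right-hand dot). The sketch below simply inverts the Larmor relation; the function names are hypothetical, and this is a consistency check rather than the fitting procedure used for Fig. 1e.

```python
MU_B = 9.2740100783e-24   # Bohr magneton, J/T
H_PLANCK = 6.62607015e-34  # Planck constant, J s


def g_factor(f_hz, b_tesla):
    """Magnitude of the g-factor implied by a resonance at (f, B)."""
    return H_PLANCK * f_hz / (MU_B * b_tesla)


def line_splitting_hz(g_left, g_right, b_tesla):
    """Frequency separation of the two EDSR lines at field B."""
    return abs(g_left - g_right) * MU_B * b_tesla / H_PLANCK


g_left = g_factor(18.66e9, 0.144)
g_right = g_factor(18.66e9, 0.149)

# Splitting at 100 mT using the quoted central values of 9.2 and 8.9.
splitting_at_100_mt = line_splitting_hz(9.2, 8.9, 0.100)
```

Both values land within the quoted ±0.1 uncertainties, and because the splitting grows linearly with B the two lines become separately addressable only at higher fields.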

Coherent control over spin-orbit states is demonstrated in a time-resolved
measurement of Rabi oscillations, explained in Fig. 2a, b.
Periodic square pulses shift the relative positions of the energy levels in 
the two dots between spin blockade and Coulomb blockade. First, the 
double dot is initialized in a parallel state by idling in spin blockade. 
This is followed by a shift to Coulomb blockade, from which electrons 
cannot escape. While in Coulomb blockade, a resonant microwave 
burst is applied for a time τ_burst to induce qubit rotation. Finally, the
double dot is brought back into spin blockade for read-out. At the 
read-out stage, the probability of the left-hand electron tunnelling 
out is proportional to the probability of projecting the final state onto 
the (1, 1) singlet. This cycle is repeated continuously. 

The singlet component in the final state is measured as the d.c. current.
The current oscillates as τ_burst is varied, reflecting Rabi oscillations
of the driven qubit (Fig. 2c). Rabi oscillations are observed for driving
frequencies in the range f ≈ 9-19 GHz. Rabi oscillations are not
observed at lower frequencies (and lower magnetic fields) because the
effective spin-orbit field, B_SO, is less than B_N, such that nuclear fluctuations
average out the coherent qubit dynamics. We note that the observation
of incoherent EDSR (Fig. 1c) requires a much smaller B_SO,
because even qubit rotations with a random phase contribute to extra
current near resonance.

Figure 2 | Rabi oscillations. a, Charge stability diagram obtained by sweeping
the voltages, V3 and V4, on gates three and four. CB, Coulomb blockade; SB,
spin blockade. b, Measurement cycle with diagrams showing electrochemical
potentials of the source (S), drain (D), left-hand dot (L) and right-hand dot (R)
for each stage. c, Rabi oscillations for a range of microwave powers at
f = 13 GHz and B = 102 mT. d, Rabi oscillations at f = 13 GHz, with fits to
a cos(f_R τ_burst + φ)/τ_burst^d + b (d = 0.8 for top trace and d = 0.5 for the bottom two
traces). Rabi frequencies are 58 ± 2, 43 ± 2 and 32 ± 2 MHz (top to bottom).
Linear slopes of 2 fA ns⁻¹, 1 fA ns⁻¹ and 0.3 fA ns⁻¹ (top to bottom) are
subtracted to flatten the average. They are attributed to photon-assisted
tunnelling. Traces are offset vertically for clarity. e, Dependence of f_R on driving
amplitude, V = 2(P × 50 Ω)^0.5, with a linear fit. f, Rabi oscillations with
separated addressing of the left- and right-hand qubits at f = 18.66 GHz and
B = 144 mT (red) and 149 mT (green), with f_R = 29 ± 2 MHz fitted to the
expression used in d.

Our highest Rabi frequency is f_R = 58 ± 2 MHz (Fig. 2d), achieved
at f = 13 GHz. The field B_SO is expected to grow with B (ref. 16);
however, at higher driving frequencies the Rabi frequency is limited
by the maximum microwave source power and by the reduced transmission
of the microwave circuit. With the strongest driving, the
amplitude of the orbital oscillation is estimated to reach 1 nm. The
qubit state is flipped in ~110 microwave periods, and thus rotated by
~1.6° per cycle of the orbital motion.
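
The figures of ~110 microwave periods and ~1.6° per cycle follow from the ratio of the drive frequency to the Rabi frequency. A short sketch of the arithmetic, assuming a flip corresponds to half a Rabi period (a π rotation); the variable names are hypothetical.

```python
DRIVE_FREQUENCY_HZ = 13.0e9  # microwave drive, f = 13 GHz
RABI_FREQUENCY_HZ = 58.0e6   # highest Rabi frequency, f_R = 58 MHz

# A spin flip (pi rotation) takes half a Rabi period.
flip_time_s = 1.0 / (2.0 * RABI_FREQUENCY_HZ)

# Number of microwave (orbital) cycles that fit into one flip.
cycles_per_flip = DRIVE_FREQUENCY_HZ * flip_time_s

# Rotation angle accumulated per cycle of the orbital motion.
degrees_per_cycle = 180.0 / cycles_per_flip
```

The flip time comes out at a little under 9 ns, which is comparable to the inhomogeneous dephasing time reported from the Ramsey measurement.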

We can resolve up to five Rabi oscillation periods. The damping of
the oscillations at a microwave power of P < −32 dBm is consistent
with a τ_burst^−0.5 decay envelope observed previously for rotations of a
single spin interacting with a slow nuclear bath. We have verified that
the relaxation time, T₁, does not limit coherent evolution on timescales
up to 1 μs (Supplementary Information, section 3). The qubit manipulation
fidelity is 48 ± 2%, estimated by comparing the values of B_SO
and B_N (ref. 18; Supplementary Information, section 5.3). As expected,
the Rabi frequency is proportional to the square root of the microwave
power, P, applied to the gate (Fig. 2e). Absorption of microwave
photons allows interdot tunnelling regardless of the qubit state. This
effect probably accelerates the decay of Rabi oscillations near the highest
power (Fig. 2d, top trace). However, the apparent photon-assisted
tunnelling is substantially reduced for P < −32 dBm, although Rabi
frequencies remain high.
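
The square-root scaling can be illustrated with a short sketch. It reads the driving-amplitude expression in the Fig. 2 caption as V = 2(P × 50 Ω)^0.5 and assumes the Rabi frequency is strictly proportional to V; the helper names are hypothetical and no measured data are used.

```python
import math


def dbm_to_watts(p_dbm: float) -> float:
    """Convert a power in dBm to watts."""
    return 1e-3 * 10.0 ** (p_dbm / 10.0)


def drive_amplitude(p_dbm: float, z0_ohm: float = 50.0) -> float:
    """Amplitude V = 2 * sqrt(P * Z0) in volts (assumed reading of the caption)."""
    return 2.0 * math.sqrt(dbm_to_watts(p_dbm) * z0_ohm)


def rabi_ratio(p1_dbm: float, p2_dbm: float) -> float:
    """Expected f_R(p2) / f_R(p1) if f_R is proportional to sqrt(P)."""
    return drive_amplitude(p2_dbm) / drive_amplitude(p1_dbm)


print(f"V at -32 dBm: {drive_amplitude(-32.0) * 1e3:.2f} mV")
print(f"f_R ratio for +6 dB: {rabi_ratio(-32.0, -26.0):.3f}")
```

A 6 dB increase in power roughly doubles the amplitude, so under this assumption it also roughly doubles the Rabi frequency.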

In Fig. 2c, d only the left qubit is rotated. Figure 2f shows data from 
coherent rotations of either the left- or the right-hand qubit induced at 
the same microwave frequency but at two different magnetic fields, 
which correspond to the two EDSR resonance conditions shown in 
Fig. 1e (ref. 22; Supplementary Information, section 4).

In the Rabi experiment, the qubit state is rotated around only one
axis. This is not enough for full qubit operation, which ultimately
requires the preparation of an arbitrary superposition of |⇑⟩ and |⇓⟩,
known as universal control. Such ability is demonstrated in a
Ramsey experiment (Fig. 3a, b). Here two short bursts with different
microwave phases are applied during the manipulation stage. In the
reference frame that rotates at the Larmor frequency, the qubit is
initially rotated from |+z⟩ to |−y⟩ on the Bloch sphere by applying
a π/2 rotation around the x axis. After a delay time, τ, we apply a 3π/2
pulse. The tunable phase of the microwave signal, φ, sets the axis of the
second rotation (φ = 0 corresponds to a rotation around x and φ = π/2
corresponds to a rotation around y). The final z component depends
on the axis of the second rotation as well as on dephasing. The double-
dot current oscillates with φ, revealing Ramsey fringes (Fig. 3a). The
contrast of the Ramsey fringes decreases with increasing τ, allowing us
to determine the inhomogeneous dephasing time, T₂* = 8 ± 1 ns
(Fig. 3b).
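
To give a feel for how quickly the fringe contrast falls off, the sketch below evaluates a Gaussian envelope exp[−(τ/T₂*)²] at the two delays shown in Fig. 3a. The Gaussian form is an assumption here (it is a common model for dephasing by a quasi-static nuclear field), and the contrast is normalized to 1 at τ = 0 rather than taken from the measured currents.

```python
import math


def ramsey_contrast(tau_ns: float, t2_star_ns: float = 8.0) -> float:
    """Normalized fringe contrast for an assumed Gaussian dephasing envelope."""
    return math.exp(-((tau_ns / t2_star_ns) ** 2))


for tau in (5.0, 20.0):
    print(f"tau = {tau:4.1f} ns -> relative contrast {ramsey_contrast(tau):.4f}")
```

With T₂* = 8 ns the envelope drops to 1/e at τ = 8 ns, which is why fringes at τ = 20 ns are far weaker than those at τ = 5 ns.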

Coherence can be extended by a Hahn echo technique, which partly 
cancels dephasing coming from a slowly varying nuclear magnetic 
field (Fig. 3c, d). In the echo sequence, a π pulse is applied halfway
between the two π/2 pulses. The contrast of the Ramsey fringes is

Figure 3 | Universal qubit control and coherence times. a, Ramsey
experiment sequence (top) and measurement of fringes I(φ) for τ = 5 and 20 ns.
The axes of the second rotation are indicated with red arrows on the Bloch
spheres for three values of φ. b, Decay of the Ramsey fringe contrast,
ΔI = I(φ = π) − I(φ = 0), fitted to exp[−(τ/T₂*)²]. c, Hahn echo sequence
(top) extends fringe contrast beyond τ = 34 ns. Fringes for two orthogonal
phases of the π pulse (π_x, blue; π_y, red) are out of phase. d, Decay of the fringe
contrast obtained for the two Hahn echo sequences (π_x, blue; π_y, red) is used to
extract T_echo from a fit to exp[−(τ/T_echo)³]. A fit to exp[−(τ/T_echo)⁴] gives a
similar value of T_echo. In this figure, the duration of a π pulse is 14 ns, with
P = −35 dBm, f = 13 GHz and B = 102 mT.

Figure 4 | Dynamical decoupling. a, Decay of the contrast of the Ramsey
fringes for CPMG sequences (top) with an increasing number of π pulses, N_π.
Solid lines are fits to exp[−(τ/T_CPMG)³]. Inset, coherence times T_CPMG versus
N_π, are fitted to N_π^d with d = 0.53 ± 0.1. Error bars are standard deviations of
ΔI(τ) fits. b, Ramsey fringes for four different phases of the initial π/2 pulse
obtained for an N_π = 3 CPMG sequence (shown above the panel) with
τ = 150 ns. The input states are indicated with arrows on the Bloch sphere. In
this figure, the duration of a π pulse is 8 ns, with P = −32 dBm, f = 13 GHz and
B = 102 mT.


extended to longer coherent evolution times by performing Hahn
echo (Fig. 3c). The phase of the fringes can be flipped depending on
whether the π rotation is around the x axis (π_x) or around the y axis
(π_y). Hahn echoes of both these types increase the coherence time to
T_echo = 50 ± 5 ns (Fig. 3d).

Gate-defined spin qubits were previously only realized in lateral 
quantum dots in GaAs-AlGaAs two-dimensional electron gases’. 
Owing to the much stronger spin-orbit interaction in InAs, the Rabi 
frequencies in our nanowire spin-orbit qubits are more than an order 
of magnitude higher than in GaAs dots. Dephasing times, T₂*, are of
the same order in InAs and GaAs quantum dots. The relatively low
T_echo found in the present work encourages further study. A likely
reason is faster nuclear spin fluctuations caused by the large nuclear 
spin of indium, I = 9/2. However, charge noise and nearby paramag- 
netic impurities cannot be ruled out as significant dephasing sources 
(Supplementary Information, section 6). Nanowires offer future solu- 
tions for suppressing the effects of nuclear spins, such as nanowires 
with sections of nuclear-spin-free silicon. The qubit can be stored in a 
silicon section of the nanowire and moved to an InAs section only for 
manipulation using spin-orbit interaction. 

In the present qubit, longer coherence times can already be achieved
by Carr–Purcell–Meiboom–Gill (CPMG) dynamical-decoupling pulse
sequences (Fig. 4a). Here a single echo π pulse is replaced with an
array of equidistant π pulses, each of which refocuses the qubit state.
The total time of coherent evolution grows as the number of π pulses is
increased (Fig. 4a, inset). Importantly, an arbitrarily prepared qubit
state in the x-y plane is preserved during the decoupling sequence. This
is verified in Fig. 4b, which shows that the phase of the initial π/2 pulse
determines the phase of the Ramsey fringes. Similar evaluation was
carried out for CPMG sequences of up to seven π pulses. In future,
more efficient dynamical decoupling can be achieved using nuclear
spin state preparation in combination with faster π pulses or adiabatic
pulse techniques.
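
The growth of coherence time with pulse number can be sketched from the power-law fit reported in the Fig. 4 caption (exponent d = 0.53 ± 0.1). The snippet below only computes the gain relative to a single π pulse; it assumes the power law holds exactly over this range, and the function name is illustrative.

```python
def cpmg_gain(n_pulses: int, d: float = 0.53) -> float:
    """Coherence-time gain relative to one pi pulse, assuming T_CPMG ~ N**d."""
    return float(n_pulses) ** d


for n in (1, 3, 7):
    low, mid, high = cpmg_gain(n, 0.43), cpmg_gain(n), cpmg_gain(n, 0.63)
    print(f"N = {n}: gain {mid:.2f} (range {low:.2f} to {high:.2f})")
```

For seven π pulses this gives a gain of roughly 2.8, with the quoted uncertainty in d spanning about 2.3 to 3.4.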


METHODS SUMMARY 


We fabricate devices on undoped silicon substrates. Instead of a global back gate,
two wide gates, B1 and B2, are located underneath the nanowire contacts (Fig. 1a).
They are set to constant positive voltages to enhance conductance through the
nanowire. The wide gates are covered by a 50-nm layer of Si₃N₄ dielectric; on top of
this layer narrow gates and a 25-nm layer of Si₃N₄ are deposited. InAs nanowires
with diameters between 50 and 80 nm are grown nearly free of stacking faults using
metal-organic vapour phase epitaxy. The wires have the wurtzite crystal symmetry
with the c axis along the long nanowire axis. Nanowires are transferred in air from
the mother chip to the device substrates, which already contain Ti-Au gates.
Selected wires are contacted with ohmic Ti-Al electrodes, and during the same
step contacts are made to the gates. We make measurements in a ³He refrigerator
at T = 300 mK. A magnetic field is applied in the plane of the substrate at an angle
of 45° ± 5° with respect to the nanowire. We create high-frequency pulses using
two arbitrary waveform generators (one gigasample per second) and a 20-GHz,
23-dBm microwave vector source. Pulses are delivered to the sample through
silver-plated CuNi coaxial lines with 36-dB attenuators followed by coplanar
striplines printed on the sample holder. Square pulses are applied synchronously
to gates two and four. Microwave bursts are applied to gate four. A measurement
cycle lasts 2 μs in the coherent rotations detailed in Fig. 2f. In the rest of the paper, a
cycle lasts 600 ns and each data point is averaged over 5-40 million cycles. The
pulse period should remain less than 2 μs to detect the double-dot current, which is
limited by the noise floor of the d.c. current amplifier.

Atom-by-atom spectroscopy at graphene edge


Kazu Suenaga & Masanori Koshino


The properties of many nanoscale devices are sensitive to local 
atomic configurations, and so elemental identification and elec- 
tronic state analysis at the scale of individual atoms is becoming 
increasingly important. For example, graphene is regarded as a 
promising candidate for future devices, and the electronic properties 
of nanodevices constructed from this material are in large part 
governed by the edge structures’. The atomic configurations at gra- 
phene boundaries have been investigated by transmission electron 
microscopy and scanning tunnelling microscopy’™*, but the elec- 
tronic properties of these edge states have not yet been determined 
with atomic resolution. Whereas simple elemental analysis at the 
level of single atoms can now be achieved by means of annular dark 
field imaging® or electron energy-loss spectroscopy®’, obtaining 
fine-structure spectroscopic information about individual light 
atoms such as those of carbon has been hampered by a combination 
of extremely weak signals and specimen damage by the electron 
beam. Here we overcome these difficulties to demonstrate site- 
specific single-atom spectroscopy at a graphene boundary, enabling 
direct investigation of the electronic and bonding structures of the 
edge atoms—in particular, discrimination of single-, double- and 
triple-coordinated carbon atoms is achieved with atomic resolution. 
By demonstrating how rich chemical information can be obtained 
from single atoms through energy-loss near-edge fine-structure ana- 
lysis*, our results should open the way to exploring the local elec- 
tronic structures of various nanodevices and individual molecules. 
A low-voltage scanning transmission electron microscope (STEM) 
was used for the single-atom spectroscopy’. Flakes were cleaved from 
the synthetic highly oriented pyrolytic graphite (HOPG) and put onto 
the microgrids for energy-loss near-edge fine structure (ELNES) 




analysis. STEM annular dark field (ADF) images indicate that the gra- 
phene flakes have open and active edges’ and that the edges are steadily 
etched by the incident electron beam when the probe-scanning is 
repeated at the same region (Supplementary Fig. 1). The accelerating 
voltage used here (60kV) is below the critical energy predicted for 
severe knock-on damage’’ and therefore the carbon atoms in bulk are 
mostly stable. Only the edge atoms are mobile during the observation, 
as indicated by the wiggling contrast frequently observed at the edge 
regions. The fast Fourier transformation of an ADF image of few-layer 
graphene shows that the spatial resolution of the experimental set-up is 
better than 0.106 nm (inset to Supplementary Fig. 1a) and so the hexa- 
gonal network of carbon atoms, separated by about 0.14 nm, is clearly 
visible in a monolayer region (Supplementary Fig. 1b). A probe of the 
same size and brightness was used for the following ELNES analysis. 

Figure 1a shows a typical ADF image of the edge region of a single
graphene layer. The hexagonal network of carbon atoms in bulk is visible 
on the right-hand side of the image and the vacuum region appears in 
black on the left-hand side. The possible carbon atom positions derived 
from the local intensity maxima of ADF signals are marked by yellow 
circles after an image-smoothing process in Fig. 1b. There is strong wiggle 
contrast at the edge regions and some of the atom positions cannot be 
completely identified. We note that some of the hexagonal networks are 
imperfect and considerably reconstructed at the edge region. 

The typical ELNES spectra of carbon K (1s)-edge are displayed with 
their corresponding atomic positions in Fig. 1c. Figure 1d shows three 
characteristic carbon K-edge fine structures extracted using sequential 
electron energy-loss spectroscopy (EELS) with probe-scanning (known 
as the spectrum-image mode)’’. The spectrum in green was recorded at 
an atomic position in bulk (indicated by a green circle and arrow in 


Figure 1 | Graphene edge spectroscopy. a, ADF 
image of single graphene layer at the edge region. 
No image-processing has been done. Atomic 
positions are marked by circles in a smoothed 
image (b). Scale bars, 0.5 nm. d, ELNES of carbon K 
(1s) spectra taken at the colour-coded atoms 
indicated in b. Green, blue and red spectra
correspond to the normal sp² carbon atom, a
double-coordinated atom and a single-coordinated
atom, respectively. These different states of atomic
coordination are marked by coloured arrows in
a and b and illustrated in c. CCD, charge-coupled
device.


Fig. 1b) as a reference. This spectrum exhibits the features of typical sp²-
coordinated carbon atoms, such as the sharp π* peak around 286 eV and
the exciton peak of σ* at 292 eV. These features are in good agreement
with the previously reported spectra recorded from a bulk graphite
specimen. The spectrum in blue was recorded from an edge atom
located at the border of the hexagonal network with two-coordination,
as illustrated in Fig. 1c. Remarkably, this spectrum has an extra peak
around 282.6 ± 0.2 eV (labelled D in Fig. 1d), with the π* peak having
reduced intensity. The exciton peak is also considerably weaker
and broader than in the bulk spectrum (marked by open circles).

The spectrum in red shows similar features, also with a weaker π*
peak and a broadened σ* peak. Its extra peak occurs at a different energy
position of 283.6 ± 0.2 eV (labelled S in Fig. 1d). It is extremely difficult
to assign the atomic position completely for this red spectrum because
the spectrum disappears quickly and is not fully reproducible. The
edge region of the specimen tends to be strongly damaged and the
edge morphology frequently changes after recording the spectrum
image. We therefore infer that this energy state is probably
damage-related. One of the possible models for this edge
structure is the Klein edge. The edge atom indicated in red in Fig. 1b
is indeed single-bonded to its neighbour. The structure should be very
unstable under the incident electron beam and so it may also explain
the wiggling contrast often observed at the graphene edge.

These spectral features involving peaks D and S have not previously 
been reported, to our knowledge. No fingerprinting method, compar- 
ing against the reference spectra of the existing polymorphic carbon, is 
able to explain them. We therefore performed ELNES simulations to
correlate the experimental features with different atomic configurations
(Fig. 2). The π* peak shift to lower energy is well reproduced
for the edge atoms in the Klein, zigzag and armchair edge configurations
(Fig. 2a, b and c), in comparison with the bulk carbon atom
(Fig. 2d). The diminished excitonic effect can be confirmed for the
Klein edge (Fig. 2a). The peak shift around 2 eV is well reproduced for
the zigzag edge (Fig. 2b). In the spectrum of the armchair edge a sharp
peak between π* and σ* is expected (Fig. 2c).

To show an atom-by-atom spectroscopy, we also performed EELS in 
the spectrum-line mode across a graphene edge. The probe scanned 
across the protruded carbon atom—the Klein edge—from the vacuum 
to the bulk region along the dotted line in Fig. 3a. A series of 100 spectra 
were sequentially recorded by scanning the electron probe with a constant
step of about 0.02 nm. The total acquisition time was as short as
50 s. The illustrated model in Fig. 3b shows that eight carbon atoms were
investigated in the spectrum line. Figure 3c shows a profile of ADF 
signals (in red) that was simultaneously recorded with the ELNES spec- 
tra. It shows good agreement with the simulated profile (in blue) show- 
ing eight maxima sequentially corresponding to the eight carbon atoms. 
Although the experimental profile is rather scattered owing to specimen 
instability or a possible inclination of the specimen to the incident 
electron beam, which should produce a slight asymmetry in the profile 
of the carbon doublets, we can deduce the carbon atomic positions 
reasonably well from the line profile and extract the ELNES spectra 
corresponding to each atom. Figure 3d shows ELNES fine structures 
obtained in this way, with the corresponding atoms numbered in 
Fig. 3c (each spectrum presented consists of four spectra in total). 
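
The scan geometry quoted above can be checked with simple arithmetic. The sketch below derives the line length, the dwell time per spectrum and the number of probe steps per carbon-carbon spacing; it uses only the numbers stated in the text (100 spectra, 0.02 nm step, 50 s total, about 0.14 nm between atoms), and the variable names are illustrative.

```python
N_SPECTRA = 100        # spectra recorded along the line
STEP_NM = 0.02         # probe step, nm
TOTAL_TIME_S = 50.0    # total acquisition time, s
C_C_SPACING_NM = 0.14  # approximate carbon-carbon distance, nm

line_length_nm = N_SPECTRA * STEP_NM
dwell_time_s = TOTAL_TIME_S / N_SPECTRA
steps_per_atom_spacing = C_C_SPACING_NM / STEP_NM

print(f"line length: {line_length_nm:.1f} nm")
print(f"dwell time per spectrum: {dwell_time_s:.2f} s")
print(f"probe steps per C-C spacing: {steps_per_atom_spacing:.0f}")
```

The resulting dwell time of 0.5 s per spectrum falls inside the 0.1 to 1.0 range quoted in the Methods Summary, and several probe positions land on each atom, which is what makes per-atom averaging possible.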

The delocalization effect at the carbon K-edge (~290 eV) with an
incident electron probe of 60 kV is estimated as 0.20-0.25 nm in classical
theory and as ~0.12 nm at 300 kV more recently. Therefore the EELS
signals, if combined with the probe size (~0.1 nm), may not be completely
localized at the single atoms on which the probe is exactly
positioned. However, the series of ELNES spectra in Fig. 3d strongly
suggest that site-specific spectroscopy is indeed possible with atomic 
resolution at the graphene edge. The spectrum from atom 1 clearly 
shows peak S at 283.6 eV (indicated by a dotted circle), which is related 
to the Klein edge, but spectrum 2 does not show peak S (it may be a 
minor feature). Spectrum 5 shows a small trace of peak D, which can 
reasonably be explained by a possible introduction of the bond-breakage 
Figure 2 | ELNES simulations for three graphene edge structures. Carbon
K-edge spectra simulated for the Klein edge (a), zigzag edge (b), armchair edge 
(c) and bulk (three-coordinated) atom (d). A core-hole was introduced by 
partially removing a 1s electron from the carbon atoms (indicated by pink 
shading) to estimate the relative peak shift of the spectra. The reduced exciton 
peak found experimentally is well reproduced. The simulated ELNES from the 
zigzag and armchair edges show at least a qualitative match with experiments, 
although the absolute value for the energy shift cannot be fully confirmed. 


during the probe-scanning across the atom. Spectrum 8 from an atom
1.5 nm away from the edge shows normal sp² features with the sharp π*
and excitonic σ* peaks, which is very close to the bulk spectrum.

We performed intensity mapping of peaks D and S to estimate the 
delocalization effects further. A number of experiments, involving one 
set of spectrum-image and seven sets of spectrum-line on the graphene 
edges, are summarized in Supplementary Figs 3, 4 and 5. Results 
confirm that single-atom spectroscopy at specific sites of the graphene 
edge is indeed feasible with the reduced delocalization effect. 

We found no trace of oxygen at the investigated edges. This may
contradict a generally accepted concept in which the graphene edge
can be terminated by -OH or -COOH groups and the edge carbon
atoms cannot be bare. In this experiment, in situ etching with continuous
removal of the carbon edge atoms in vacuum always takes
place and therefore the edge structures are always kept fresh.

From this study, we have picked up some practical information 
about the graphene edge engineering. The open edges involve both 
single- and double-coordinated carbon atoms but their specific edge 
Figure 3 | Atom-by-atom spectroscopy across the Klein edge. a, ADF image
of graphene edge (no image-processing). The dotted arrow indicates where the 
spectrum-line was made (A to B). Scale bar, 0.5 nm. b, An atomic model of the 
investigated edge. c, Line-profile of the ADF counts (in red) recorded 
simultaneously with the spectrum-line. For comparison with the simulated 
ADF counts (blue), the number of each atom is indicated (from 1 to 8). d, The 
carbon K-edge ELNES obtained from each atom across the Klein edge. The 
single-coordinated carbon atom (numbered 1) clearly shows peak S. 


states are completely localized at the atomic level. Even for triple-
coordinated carbon atoms, slight electronic structure modification,
as indicated by the restricted excitonic effect (or the reduced σ* peak),
may exist near the edge region but it vanishes beyond 1.5 nm from the
edge front. The properties of graphene nanoribbons with smaller
widths might be governed by the edge effects.

It is surprising that EELS signal delocalization has turned out
not to be very important for atom-by-atom spectroscopy in the present
experiment. The EELS signal delocalization should be substantially
decreased when a lower accelerating voltage is used for the incident
electron probe. The delocalization effect with a 30-60 kV incident
probe is only a fraction of that for the normal STEM operation voltage
of 200-300 kV. Lowering the accelerating voltage of the electron microscope
is therefore very beneficial: it reduces the delocalization effect in
addition to enhancing contrast and reducing damage.

ELNES analysis from single atoms is highly desirable because the rich
information it supplies will become accessible from individual atoms at
any local area. The ELNES fingerprinting method has been widely used
to determine the electronic/bonding states of unknown materials by
comparison with the reference spectra of known materials. For example,
the chemical state of Ce³⁺ or Ce⁴⁺ in metallofullerene molecules has
been clearly discriminated at the single-atom level simply by measuring
the energy shift. Here we have demonstrated the possibilities of ELNES
spectra analysis beyond the simple fingerprinting method. Non-bulk
atoms provide peculiar electronic structures and therefore their ELNES
should be completely new (or previously unknown) and cannot be
compared with any existing reference. Further efforts should be made
to obtain the electronic state information from new ELNES spectra by
combining atomic resolution imaging with theoretical calculations.


METHODS SUMMARY 

STEM-EELS experiments. A JEOL 2100F transmission electron microscope with
the DELTA corrector was operated at 60 kV (ref. 9). The energy resolution was
around 0.4 eV. We used a probe of 0.1 nm diameter with 20 pA for experiments.
For spectroscopy, we used GIF Quantum, designed for low-voltage operations.
The convergence angle for the incident probe was set to 30 mrad, while the inner angle
for ADF imaging was around 45-50 mrad, which is equal to the EELS collection
angle. ELNES analysis was performed at each pixel while the incident probe was
digitally scanned. The spectrum-image mode, consisting of a two-dimensional
set of ELNES spectra, takes longer for total acquisition and easily leads to the
destruction of the specimen. Therefore we frequently used the spectrum-line
mode, consisting of a one-dimensional set of ELNES spectra, in this study.
Typical acquisition time is around 0.1 to 1.0 s for each spectrum. A spectrum line
consists of 100 spectra, while a spectrum image consists of typically 12 × 12
spectra (see also Supplementary Fig. 3).

Specimen preparation. Commercially available synthetic HOPG (NT-MDT
Company) was used for experiments. Some of the flakes were cleaved using
Scotch tape and then transferred to transmission electron microscope microgrids
following the method developed by Meyer and co-workers.

ELNES simulations. A first-principles calculation based on density functional
theory (DFT) was used to estimate energy levels and partial density of states on
carbon atoms of graphene structures. In the discrete variational Xα method, the
energy levels and partial density of states of unoccupied carbon 2p orbitals are
estimated from the self-consistent charge calculation. To estimate the threshold
energy of the carbon K-edge, the core-hole effect was considered by employing the
transition-state approximation method, in which half an electron is removed from
the carbon 1s orbital and added to an unoccupied orbital. See also Supplementary Fig. 2.

Mantle superplasticity and its self-made demise


Takehiko Hiraga, Tomonori Miyazaki, Miki Tasaka & Hidehiro Yoshida


The unusual capability of solid crystalline materials to deform 
plastically, known as superplasticity, has been found in metals 
and even in ceramics’. Such superplastic behaviour has been specu- 
lated for decades to take place in geological materials, ranging from 
surface ice sheets to the Earth’s lower mantle**. In materials science, 
superplasticity is confirmed when the material deforms with large 
tensile strain without failure; however, no experimental studies 
have yet shown this characteristic in geomaterials. Here we show 
that polycrystalline forsterite + periclase (9:1) and forsterite + 
enstatite + diopside (7:2.5:0.5), which are good analogues for 
Earth’s mantle, undergo homogeneous elongation of up to 500 
per cent under subsolidus conditions. Such superplastic deforma- 
tion is accompanied by strain hardening, which is well explained by 
the grain size sensitivity of superplasticity and grain growth under 
grain switching conditions (that is, grain boundary sliding); grain 
boundary sliding is the main deformation mechanism for super- 
plasticity. We apply the observed strain-grain size-viscosity rela- 
tionship to portions of the mantle where superplasticity has been 
presumed to take place, such as localized shear zones in the upper 
mantle and within subducting slabs penetrating into the transition 
zone and lower mantle after a phase transformation. Calculations 
show that superplastic flow in the mantle is inevitably accompanied 
by significant grain growth that can bring fine-grained (≤1 μm)
rocks to coarse-grained (1-10 mm) aggregates, resulting in increasing
mantle viscosity and finally termination of superplastic flow.

Microstructures of some deformed rocks (such as ultramylonite) 
resemble those of experimental superplastic materials. Further, syn- 
thetic mineral aggregates that are analogues of rocks often exhibit a 
dependence of flow rate on stress and grain size similar to that found 
during superplastic deformation. On the basis of these results, super- 
plasticity has been speculated to occur in geological materials, includ- 
ing glaciers’, lower crust**, upper mantle*”, transition zone® and lower 
mantle’*. The term ‘superplasticity’ refers to tensile deformation to 
large strain without failure; however, no experimental studies have 
shown this characteristic in geomaterials. Thus, Earth scientists have 
been compelled to use superplasticity to describe creep accommodated 
mainly by grain boundary sliding (GBS)*”, which is considered to be 
the primary deformation mechanism for superplasticity’®. Grain 
switching as a result of GBS is expected to leave little evidence of its 
operation. Grains with random crystallographic orientations and 
equant shapes are characteristic of superplastically deformed samples. 
Thus, geologists have identified superplasticity (or GBS) in their col- 
lected rocks simply by eliminating other possible deformation 
mechanisms, such as dislocation and diffusion creep. 

Two types of polycrystalline aggregates were prepared by sintering of
nanosized mineral powders (see details in Methods and ref. 11): one
consisting of 90 vol.% forsterite (Mg₂SiO₄) and 10 vol.% periclase (MgO)
(Fo+Per), and one of 70 vol.% forsterite, 25 vol.% enstatite (MgSiO₃) and
5 vol.% diopside (CaMgSi₂O₆) (Fo+En+Di). These aggregates are good
analogues for lower and upper mantle composites. Minimizing the
particle size of the initial powders and introducing secondary phases,
such as periclase and pyroxenes, facilitated the maintenance of a fine
grain size (<500 nm) with a porosity of <0.1 vol.% for both samples.


Tensile tests at a constant displacement rate (v = 10 °-10 *mms_’) 
were performed on the blade-shaped sintered aggregates with an 
Instron-type testing machine at 1,350-1,450 °C in air. A few compres- 
sion tests were also carried out on cylinder-shaped samples to explore 
the effect of deformation conditions such as temperature and loading 
type on the strain-grain size relationship, as discussed later. A piece of 
the same starting material was placed next to the creep sample in all 
tests to see how the microstructure changed simply through static 
annealing (we refer to this sample as a reference sample). 

Typical displacement-stress curves for our materials exhibited
strain hardening in the initial to middle stages of the deformation
and weakening in the final stage before failure (Supplementary Fig. 1).
Samples with and without failure both exhibited almost homogeneous
strain (Fig. 1). The largest elongation of 515% was achieved for a sample
of Fo+Per at v = 1.2 × 10⁻³ mm s⁻¹ (corresponding to an initial strain
rate (ε̇) of 1.0 × 10⁻⁴ s⁻¹) at 1,450 °C. A maximum elongation of 315%
for a sample of Fo+En+Di was reached at the same displacement speed
and at 1,350 °C. Both samples clearly demonstrate superplasticity, which
is defined by a tensile strain of ≫100% (ref. 12).
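
For readers who want to relate the quoted elongations to strain, the sketch below converts percentage elongation to true (logarithmic) strain and computes the nominal strain rate implied by an elongation reached in a given time at constant displacement rate. The two time values are taken from the Fig. 2 caption; the function names are illustrative, and treating the nominal rate as elongation divided by time is an assumption that holds only for a constant displacement rate.

```python
import math


def true_strain(elongation_percent: float) -> float:
    """True (logarithmic) strain for a given engineering elongation in percent."""
    return math.log(1.0 + elongation_percent / 100.0)


def nominal_strain_rate(elongation_percent: float, time_s: float) -> float:
    """Engineering strain per second, assuming a constant displacement rate."""
    return (elongation_percent / 100.0) / time_s


for elongation in (515.0, 315.0):
    print(f"{elongation:.0f}% elongation -> true strain {true_strain(elongation):.2f}")

# Elongation and duration pairs quoted in the Fig. 2 caption.
slow = nominal_strain_rate(84.0, 16876.0)
fast = nominal_strain_rate(399.0, 7984.0)
print(f"nominal strain rates: {slow:.2e} and {fast:.2e} per second")
```

The two Fig. 2 samples come out a factor of about ten apart in nominal strain rate, consistent with the tenfold difference in their displacement rates.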

Superplasticity in the Fo+Per system occurred over a wider range of
temperature and strain rate than in the Fo+En+Di system; thus, we
primarily describe deformation behaviour and microstructure of the
Fo+Per samples in this study. Microstructures of starting, deformed
and reference samples were observed by field-emission-gun scanning
electron microscopy (FEG-SEM) (Fig. 2a-c). Both starting and reference
samples exhibited homogeneous (random) distribution of grains
of secondary phase (periclase; Fig. 2a); however, static grain growth was
detected in the reference samples, following the relationship


$$d_s^4 - d_0^4 = kt \qquad (1)$$


where d_s and d_0 are grain size under static annealing conditions and
initial grain size, respectively, t is time and k is the growth coefficient
(Supplementary Fig. 2). The relationship is expected to hold when an
intergranular secondary phase effectively pins the growth of the first
phase, and the growth of the secondary phase is rate-limited by grain
boundary diffusion of the slowest ions. The presence of periclase
grains at intergranular regions between forsterite grains indicates the
operation of such a process (Fig. 2a). Grains are equigranular in the

Figure 1 | Specimens before and after tensile deformation experiments.
a, Starting sample of Fo+Per. b, Fo+Per sample after 515% elongation
(v = 1.2 × 10⁻³ mm s⁻¹ and T = 1,450 °C). c, Fo+En+Di sample after 315%
elongation (v = 1.2 × 10⁻³ mm s⁻¹ and T = 1,350 °C).
Figure 2 | Microstructures of reference and deformed samples, and
schematic illustration of the deformation process. Sample surfaces were
thermally etched at 1,250 °C for 0.5 h in air. Pairs of arrows indicate the tensile
directions. False colours: green, forsterite; red, periclase. a, Reference sample for
Fo+Per shown in c. b, Fo+Per sample with 84% elongation at
v = 6.0 × 10⁻⁴ mm s⁻¹. Total experimental time is 16,876 s. c, Fo+Per sample
with 399% elongation at v = 6.0 × 10⁻³ mm s⁻¹. Total experimental time is
7,984 s. d, Model of grain switching accompanied by grain coalescence
(modified after ref. 15).


starting and reference samples, whereas grains in the deformed samples 
were weakly elongated in the direction of tension (Fig. 2b, c). Further, a 
very weak lattice-preferred orientation (a-axis concentration parallel to 
the tensile direction) and few dislocations were detected under SEM- 
EBSD (electron back scattered diffraction) and transmission electron 
microscopy, respectively (Supplementary Figs 3 and 4); these observa- 
tions correspond to deformation characteristics expected from the 
model of superplasticity’®. Deformed samples always exhibit larger 
grain size than their reference samples (Fig. 2a, c). Further, a sample 
with larger strain but deformed for a shorter period of time exhibited a 
larger grain size than a sample with less strain deformed for longer time 
(Fig. 2b, c). 

Randomly distributed grains of secondary phase (=Per) in starting 
materials coalesced perpendicular to the tensile direction in deformed 
samples, and such structure is readily recognized in the samples 
deformed under larger strain and strain rate. The coalescence is well 
explained by grain switching of the primary phase accompanied by 
movement of grains of the secondary phase, as illustrated in Fig. 2d 
(refs 10, 15). When grain switching is fast relative to grain boundary 
migration, a coalesced structure composed of multiple grains can be 
preserved. A similar structure is identified in natural ultramylonite 
(see, for example, ref. 16). Here, we assume that grains of both phases 
can grow until they encounter grains of a different phase. As grains of 
secondary phase are isolated from one another, their growth is essen- 
tially controlled by their coalescence due to grain switching (Fig. 2d). 
In the deformed samples, the grain size of the primary phase (d_I) is
expected to increase parallel to the grain size of the secondary phase
(d_II) (that is, d_I/d_II = constant); thus, grain sizes of both phases are
expected to follow the relationship (see details in Supplementary 
Discussions)'” 


$$\ln\left(\frac{d}{d_0}\right) = \alpha\varepsilon \qquad (2)$$


where α is a coefficient determined by the fraction of secondary phase
grains coalesced in a single grain switching event. Here we use d as an
average grain size without distinguishing between phases. It has been
shown that d_0 can be replaced by the grain size from the reference
samples, d_ref, when static grain growth is not significant’®. The ln(d/d_ref)
versus ε data from the Fo+Per samples are well fitted by equation (2)
Figure 3 | Experimental data (ln(d/d_ref) versus ε) for Fo+Per samples. See
text for details of model. Comp., compression; tens., tension.


with α ~ 0.3 (Fig. 3), the value commonly reported for metals and
ceramics’®. Overall, under deformation via the GBS dominant creep 
mechanism, we are able to predict grain growth by combining equa- 
tions (1) and (2) (Supplementary Table 1). 
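
As a minimal sketch of the strain-induced growth law in equation (2), the snippet below evaluates the grain size expected after a given true strain, assuming the form ln(d/d_ref) = αε with α ≈ 0.3 as fitted in Fig. 3. The function name and the 1 µm reference grain size are illustrative assumptions, not part of the original analysis.

```python
import math


def dynamic_grain_size(d_ref, strain, alpha=0.3):
    """Grain size after true strain `strain`, from ln(d / d_ref) = alpha * strain."""
    return d_ref * math.exp(alpha * strain)


# Hypothetical reference grain size of 1 um, evaluated at a few true strains.
for eps in (0.0, 0.5, 1.0, 1.6):
    print(f"strain {eps:.1f}: d = {dynamic_grain_size(1.0, eps):.3f} um")
```

Because the growth factor depends only on accumulated strain, a sample strained quickly can end up coarser than one strained less over a longer time, which is the behaviour noted for Fig. 2b, c.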

Rheological data were analysed on the basis of the power-law 
relationship 


$$\dot{\varepsilon} = A\,\frac{\sigma^{n}}{d^{p}} \qquad (3)$$


where ε̇ is strain rate, A is a constant, σ is stress, n is the stress exponent,
d is grain size and p is the grain size exponent. To determine flow 
parameters, the value of p was derived first. As grain growth inevitably 
occurred during the experiments, we attribute strain hardening of the 
sample to grain growth and attribute weakening to cavity formation. 
To obtain flow strength free from the effect of cavities, we analysed 
flow data only for ε ≤ 1.0, before cavity formation. As described above,
we explain the change of grain size during the tests through a combination
of static and dynamic grain growth laws, which allow us to
attribute the hardening to grain growth with p ~ 1.5. Using this value
to normalize grain size to 1 µm, we find a linear relationship between
log σ and log ε̇, resulting in a value of n = 2.3 (see details in Methods
and Supplementary Fig. 5), which corresponds to the value commonly 
reported from superplastic ceramic materials (that is, n ~ 2)". 
Although true tensile elongation is not likely to be important under 
geologic conditions, grain-scale deformation processes during laboratory 
tensile tests should be the same as those during mantle flow. Geological 
settings at probable temperatures where superplasticity has been pro- 
posed to take place are considered here: that is, in localized shear zones 
in the upper mantle at T~ 700°C (ref. 5), and in subducting slabs 
penetrating into the transition zone at T ~ 1,500 °C (ref. 6) and into 
the top of the lower mantle at T ~ 1,600 °C (refs 7, 8). Here we assume 
grain growth of the major phase (70 vol.%) in the upper mantle, the 
transition zone, and the lower mantle to be pinned by secondary phases 
(30 vol.%); the major and secondary phases are respectively olivine and 
pyroxene in the upper mantle, ringwoodite and majorite in the transition
zone, and Mg-perovskite and magnesiowüstite (and/or Ca-perovskite)
in the lower mantle. Under static conditions, grain growth
can be predicted by equation (1) by estimating k from numerous 
parameters, including grain boundary diffusivity of the slowest ions 
(Supplementary Discussions)'*"*. On the basis of Si diffusivities for 
each mineral and temperature’ ~’, the value of k for the three mantle 
settings varies by only one order of magnitude. For the sake of simplicity, 
we use k = 10 °*'m*s ‘ as a representative value for all three settings. If
we start the aggregates with a grain size of 1 µm, corresponding to the
minimum grain size observed in the shear zone’ and the predicted value
after a phase transformation during subduction”, grain size will follow
the black solid line in Fig. 4; this gives ~30 µm and ~200 µm after
Figure 4 | Predicted grain size and normalized viscosity as a function of
time under static and dynamic conditions applicable to three different 
mantle settings. Settings are: localized shear zone in upper mantle; subducting 
slab penetrating into transition zone; and subducting slab penetrating into the 
top of lower mantle. We assume d_0 = 1 µm, ln(d/d_ref) = 0.3γ/2 and p = 2
(equation (1)). Black solid line, static grain growth; coloured solid lines, 
dynamic grain growth with different strain rates from 10 1° to 10 °s 1. 
Dashed lines, previously predicted static grain growth at the top of lower 
mantle**”*. η₀, viscosity when d = 1 µm. See details in text and Supplementary
Information. 


10 Myr and 1 Gyr, respectively. For the case of the top of the lower 
mantle, the experimental result** and previous estimates using a grain 
growth law for a volume diffusion mechanism’, both of which are 
represented by dashed lines in the figure, predict ~20 µm and
~400 µm, respectively, after 1 Gyr. Although estimation of static grain
growth contains significant uncertainties, pinning due to the secondary 
phase is so effective overall that grain growth under static conditions is 
too slow to reach a realistic grain size of 1-10 mm in the mantle settings 
considered here. Under dynamic conditions, grain growth can be predicted
by a combination of equations (1) and (2) by imposing ε̇. When
the rocks deform at an é of 10 '°-10~'°s” |, the strain effect on growth 
rate becomes significant between 10° and 10° years, depending on strain 
rate (coloured lines in Fig. 4). Accordingly, mantle viscosity increases 
dramatically. 
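
The link between grain growth and viscosity can be sketched from equation (3): writing viscosity as stress divided by strain rate gives a value proportional to d^p at fixed stress, so the viscosity normalized to its value at d = 1 µm is (d / 1 µm)^p. The snippet below is an illustration under that constant-stress assumption; the function name is hypothetical, and the default p = 1.5 is the experimental value quoted in the text (the value used for Fig. 4 may differ).

```python
def normalized_viscosity(d_um, p=1.5, d0_um=1.0):
    """Viscosity relative to its value at grain size d0_um, at constant stress.

    Follows from strain_rate = A * stress**n / d**p, so viscosity scales as d**p.
    """
    return (d_um / d0_um) ** p


# Hypothetical grain sizes in micrometres.
for d in (1.0, 10.0, 100.0, 1000.0):
    print(f"d = {d:7.1f} um -> viscosity ratio = {normalized_viscosity(d):.3g}")
```

Under these assumptions a hundredfold increase in grain size raises the viscosity by about three orders of magnitude.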

Microstructures of a peridotite mylonite from a shear zone, as com- 
pared to the deformation mechanism map for olivine aggregates, indi- 
cate that diffusion, GBS and dislocation creep took place in domains 
with grain sizes of 1-10 µm, 10-300 µm and >300 µm, respectively>”®.
Displacement at the shear zone does not contribute to a change in 
creep regime from diffusion to GBS creep, whereas a change from GBS 
to dislocation creep can occur through deformation-induced grain 
growth (Fig. 4). Assuming a shear strain rate (γ̇) of 10⁻¹² s⁻¹, the
change from GBS to dislocation creep can occur after ~5 × 10⁵ years
corresponding to a shear strain of ~15, indicating that the domain 
deforming via a grain-size sensitive mechanism cannot survive after 
such deformation in the shear zone. 
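
The relation between strain rate, elapsed time and accumulated shear strain used here is simply strain = rate × time. The sketch below takes, as an assumption for illustration, a shear strain rate of 10⁻¹² s⁻¹ acting for 5 × 10⁵ years, a pair of values that reproduces the quoted shear strain of roughly 15; the function name and the year length are likewise assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an assumed convention


def accumulated_shear_strain(strain_rate_per_s, duration_years):
    """Shear strain accumulated at a constant shear strain rate."""
    return strain_rate_per_s * duration_years * SECONDS_PER_YEAR


# Assumed illustrative inputs: 1e-12 per second sustained for 5e5 years.
gamma = accumulated_shear_strain(1e-12, 5e5)
print(f"accumulated shear strain = {gamma:.1f}")
```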

When a subducting slab of thickness 200 km penetrates into the 
transition zone or lower mantle at 5 cm yr⁻¹, the shear strain rate inside
the slab will be ~10⁻¹⁴ s⁻¹, assuming simple shear geometry. Although
numerous assumptions are involved, the boundary between diffusion 
and dislocation creep is considered to lie at a grain size of the order of 
1 µm in the transition zone”’. In this case, the slab deforms via dislocation
creep immediately after the transformation. A grain size of ~3 mm
in the top of the lower mantle is required to explain geophysically 
estimated viscosity with experimentally obtained Si diffusivity’’. As 
shown, such a large grain size is never attained under static conditions; 
however, under dynamic conditions, this size can be attained in 
~6 × 10⁷ years following the transformation, which corresponds to
~3,000 km horizontal movement of the slab. Our calculation demonstrates
that deformation-induced grain growth controls grain size, viscosity
and the extent (in space and time) of superplasticity in the
mantle. 
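
The slab estimates above can be checked with back-of-envelope arithmetic. The sketch below assumes simple shear, so that the shear strain rate is the slab velocity divided by its thickness, and converts a horizontal travel distance into time at the same velocity. Function names are hypothetical; the inputs (200 km, 5 cm per year, 3,000 km) are the values quoted in the text.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an assumed convention


def simple_shear_rate(velocity_m_per_yr, thickness_m):
    """Shear strain rate (per second) for simple shear across a layer."""
    return velocity_m_per_yr / thickness_m / SECONDS_PER_YEAR


def travel_time_years(distance_m, velocity_m_per_yr):
    """Time needed to cover a distance at constant velocity."""
    return distance_m / velocity_m_per_yr


rate = simple_shear_rate(0.05, 200e3)    # 5 cm/yr across a 200 km thick slab
years = travel_time_years(3000e3, 0.05)  # 3,000 km at 5 cm/yr
print(f"shear strain rate = {rate:.1e} per second")
print(f"travel time = {years:.1e} years")
```

The rate comes out slightly below 10⁻¹⁴ s⁻¹, and the travel time is 6 × 10⁷ years, consistent with the order-of-magnitude figures in the paragraph.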


METHODS SUMMARY 


Mineral powders for sintering were prepared through solid state reactions between 
nanosized powders of Mg(OH)₂ and colloidal SiO₂ with and without CaCO₃. Such
powders were then cold pressed under isostatic pressure into bars. Subsequently, 
they were vacuum sintered to obtain very dense fine-grained materials. The blade- 
shaped sintered aggregates for tensile experiments were machined to a gauge 
length of 12 mm, a width of 2 mm, and a thickness of 2 mm (Fig. 1a). For creep
experiments, we used an Instron 5580 uniaxial mechanical testing machine with a 
furnace to heat the samples. The samples were held by SiC rods, and testing 
temperatures were established by raising the temperature at 650 °C h⁻¹. All tests
were conducted in a temperature range of 1,350-1,450 °C and under atmospheric 
conditions. Constant displacement rates were established in all the tests. Tensile 
strain was determined from the crosshead displacement, assuming uniform 
elongation in the gauge portion. After the tests, all samples were polished in the 
plane parallel to the tension direction and thermally etched to expose grain and 
interphase boundaries. The diameter of each grain was measured by approximat- 
ing the grain shape to a circle with imaging software. More than 170 grains were 
measured in each sample to obtain the mean diameter of the circles, which should 
represent grain size in the sample. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Mineral powders for sintering were prepared through solid state reactions among
nano-sized powders of Mg(OH)₂, colloidal SiO₂ and ±CaCO₃ at ~1,000 °C.
Calcined powders were cold pressed under an isostatic pressure of 200 MPa into
bars of ~5 × 10 × 30 mm. Subsequently, they were vacuum sintered at 1,330-
1,350 °C to obtain very dense fine-grained materials. We have reported the details
of these procedures elsewhere’’**. Average grain sizes before deformation experiments
were 280 and 480 nm for the samples of Fo+Per and Fo+En+Di, respectively.
The blade-shaped sintered aggregates for tensile experiments were
machined to a gauge length of 12 mm, a width of 2 mm, and a thickness of
2 mm (Fig. 1a). A cylindrical furnace with heating elements of Kanthal Super
was used to heat the samples. The furnace was attached to an Instron 5580 uniaxial 
mechanical testing machine. The samples were held by SiC rods consisting of 2 to 3 
parts with flexible joints so that the sample could be adjusted to tensile geometry 
after a small amount of displacement. Testing temperatures were established by 
raising the temperature at 650 °C h⁻¹. All tests were conducted under constant
displacement rate (v= 10 *-10 *mms_!). Tensile strain was determined from 
the crosshead displacement by considering the compliance of the apparatus and by 
assuming uniform elongation in the gauge portion. As achievable displacement 
was so large, nominal strain can be significantly different from true strain. Thus, 
we use true strain (ε) instead of nominal strain when discussing the strain effect on
creep characteristics. We collected one force-displacement-time datum per 
1,000-2,000 ms using a load cell attached to the crosshead of the Instron machine. 
We read stress and strain rate at 0.2 < ε < 1 (Supplementary Table 1). Within this
range, we are able to obtain reliable stress and strain rate data, which are free from 
the effects of frictional and elastic behaviours of the samples and sample holders 
and from the effect of cavitation on sample strength. Half of the tests were con- 
ducted until sample failure. Experiments at higher temperature and slower dis- 
placement rate tended to achieve larger strain before failure. Compression tests 
were conducted on cylinder-shaped samples (~5 mm radius, 10 mm length) at 
v=(1.7 X 10~“)-(1.3 X 10-7) mms! at 1,200 and 1,300°C. All samples were 
quenched at ~20 °C min⁻¹ to preserve deformation microstructure.
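
To illustrate why true strain is preferred at these elongations, the sketch below converts nominal elongation into true (logarithmic) strain using the standard relation ε = ln(1 + e), which assumes uniform elongation of the gauge section. The function name is hypothetical; the elongations are those quoted in the figure captions.

```python
import math


def true_strain(elongation_percent):
    """True (logarithmic) strain for a given nominal elongation in percent."""
    return math.log(1.0 + elongation_percent / 100.0)


for elongation in (84, 315, 399):
    nominal = elongation / 100.0
    print(f"{elongation}% elongation: nominal strain {nominal:.2f}, "
          f"true strain {true_strain(elongation):.2f}")
```

At 399% elongation the nominal strain is close to 4 whereas the true strain is about 1.6, so the two measures diverge strongly over the range reached in these tests.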
After the tests, all the samples were polished in the plane parallel to the tension
direction and thermally etched at temperatures more than 100 °C lower than that used
for the experiments for <0.5 h in air to expose grain and interphase boundaries'’”*
(Fig. 2 and Supplementary Fig. 6). Microstructural change during the thermal etching
was confirmed to be negligible by microstructural observations of the samples that
were chemically etched with dilute HCl + HNO₃. We did not apply any etching
techniques to the samples for TEM observations. We measured the diameter of each 
grain by approximating the grain shape to a circle with imaging software. The mean 
diameter of the circles should represent grain size in the sample, when grain shape is 
essentially equiaxed. More than 170 grains were measured in each sample. 
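
The circle approximation can be sketched as an equivalent-circle diameter: each grain's sectional area A is converted to the diameter of a circle of equal area, d = 2·sqrt(A/π), and the diameters are averaged. This is an illustration only; it assumes the imaging software reports per-grain areas, and the function names and sample areas are hypothetical.

```python
import math
from statistics import mean


def equivalent_circle_diameter(area):
    """Diameter of the circle whose area equals the measured grain area."""
    return 2.0 * math.sqrt(area / math.pi)


def mean_grain_size(areas):
    """Mean equivalent-circle diameter over a set of grain areas."""
    return mean(equivalent_circle_diameter(a) for a in areas)


# Hypothetical grain areas in square micrometres (not measured data).
example_areas = [0.05, 0.08, 0.12, 0.20, 0.07]
print(f"mean grain size = {mean_grain_size(example_areas):.3f} um")
```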

As we are able to reproduce grain growth with grain growth laws for both static
and dynamic conditions (equations (1) and (2)), we can estimate grain size at any
point of the deformation when experimental time and strain are known (we refer to
this grain size as d*) (Supplementary Table 1). Such grain size is used to extract flow
parameters such as grain size and stress exponents in equation (3) from mechanical
data. Details of this procedure follow. Final (total) strain of the samples, ε_fin, and
final grain size, d_fin, are substituted into equation (2) to obtain α for each sample.
Grain size versus time for reference samples of Fo+Per annealed at 1,450 °C is
shown in Supplementary Fig. 2. The best fit indicates a grain growth exponent of
~4, which corresponds to the predicted exponent in equation (1). The best fit
relationship of static grain growth allows us to predict the grain size d_ref at t, which
corresponds to the time when the deformed sample reached ε. With calculated α
and d_ref we are able to predict grain size in the deformed samples (d*) using
equation (2). Values used for ε, t and d* are all listed in Supplementary Table 1.
Within a small range of strain, ε̇ can be approximated as a constant so that grain size
exponent p is obtained from strain hardening (that is, analysis of ∂ln σ/∂ln d*),
giving p ~ 1.5. Once p is fixed, ε̇ is normalized to the value for a grain size of
d = 1 µm by ε̇ × (d*/d)^1.5. Finally, we plot stress (log σ) versus normalized strain
rate (log ε̇ × (d*)^1.5) as shown in Supplementary Fig. 5. Linear relationships between
log σ and log ε̇ were found, resulting in values of n = 2.3 (Supplementary Fig. 5).
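
The normalization and fitting steps described above can be sketched as follows, assuming the power-law flow relation in which strain rate scales as stress to the power n divided by grain size to the power p. Strain rates are rescaled to a grain size of 1 µm by multiplying by d^p (d in µm), and the stress exponent is the least-squares slope of log strain rate against log stress. The data below are synthetic values generated from the flow law itself with n = 2.3 and p = 1.5, not measurements; all names and numbers are illustrative assumptions.

```python
import math


def normalize_strain_rate(strain_rate, d_um, p=1.5):
    """Strain rate rescaled to a grain size of 1 um: rate * (d / 1 um)**p."""
    return strain_rate * d_um ** p


def fit_stress_exponent(stresses, normalized_rates):
    """Least-squares slope of log10(rate) against log10(stress), i.e. n."""
    xs = [math.log10(s) for s in stresses]
    ys = [math.log10(r) for r in normalized_rates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic inputs (hypothetical): prefactor, stresses in MPa, grain sizes in um.
A, n_true, p = 1e-9, 2.3, 1.5
stresses = [10.0, 20.0, 40.0, 80.0]
grain_sizes = [0.5, 0.8, 1.2, 2.0]
rates = [A * s ** n_true / d ** p for s, d in zip(stresses, grain_sizes)]

normalized = [normalize_strain_rate(r, d, p) for r, d in zip(rates, grain_sizes)]
n_fit = fit_stress_exponent(stresses, normalized)
print(f"fitted stress exponent n = {n_fit:.2f}")
```

Because the synthetic rates are built from the same law, the fit returns the input exponent; with real data the scatter about the line indicates how well a single p and n describe the tests.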


28. Hiraga, T., Tachibana, C., Ohashi, N. & Sano, S. Grain growth systematics for 
forsterite + enstatite aggregates: effect of lithology on grain size in the upper 
mantle. Earth Planet. Sci. Lett. 291, 10-20 (2010).
Medulloblastoma encompasses a collection of clinically and molecularly
diverse tumour subtypes that together comprise the most
common malignant childhood brain tumour’. These tumours are 
thought to arise within the cerebellum, with approximately 25% 
originating from granule neuron precursor cells (GNPCs) after 
aberrant activation of the Sonic Hedgehog pathway (hereafter, 
SHH subtype)* *. The pathological processes that drive heterogeneity 
among the other medulloblastoma subtypes are not known, hinder- 
ing the development of much needed new therapies. Here we provide 
evidence that a discrete subtype of medulloblastoma that contains 
activating mutations in the WNT pathway effector CTNNB1 (here- 
after, WNT subtype)'** arises outside the cerebellum from cells of 
the dorsal brainstem. We found that genes marking human WNT- 
subtype medulloblastomas are more frequently expressed in the 
lower rhombic lip (LRL) and embryonic dorsal brainstem than in 
the upper rhombic lip (URL) and developing cerebellum. Magnetic 
resonance imaging (MRI) and intra-operative reports showed that 
human WNT-subtype tumours infiltrate the dorsal brainstem, 
whereas SHH-subtype tumours are located within the cerebellar 
hemispheres. Activating mutations in Ctnnb1 had little impact on 
progenitor cell populations in the cerebellum, but caused the abnor- 
mal accumulation of cells on the embryonic dorsal brainstem which 
included aberrantly proliferating Zic1+ precursor cells. These
lesions persisted in all mutant adult mice; moreover, in 15% of cases 
in which Tp53 was concurrently deleted, they progressed to form 
medulloblastomas that recapitulated the anatomy and gene expres- 
sion profiles of human WNT-subtype medulloblastoma. We provide 
the first evidence, to our knowledge, that subtypes of medulloblas- 
toma have distinct cellular origins. Our data provide an explanation 
for the marked molecular and clinical differences between SHH- 
and WNT-subtype medulloblastomas and have profound implica- 
tions for future research and treatment of this important childhood 
cancer. 

SHH-subtype medulloblastoma is characterized by aberrant SHH 
signalling that is often driven by inactivating mutations in PTCH1**. 
These medulloblastomas tend to arise in very young children, display a 
‘large cell-anaplastic’ or ‘desmoplastic’ histology and have a relatively
poor prognosis” *. WNT-subtype medulloblastomas are strikingly dif- 
ferent. Arising in much older children, these highly curable tumours 
have ‘classic’ morphology and activating mutations in CTNNB1'*.
Mouse models have shown that SHH-subtype medulloblastomas arise 
from committed GNPCs of the cerebellum’* and enabled the develop- 
ment of new therapies that suppress the oncogenic SHH-signal”"®. It has 
been suggested that the other medulloblastoma subtypes might have a 
different cellular origin*'"”’, but little is known about their biology and 
there are no mouse models of these tumours. 

Recently, we showed that subtypes of the brain tumour ependymoma 
arise from discrete populations of neural progenitor cells with which 
they share similar gene expression profiles’’. Therefore, to determine if 
medulloblastoma subtypes also arise from discrete cell populations, we 
first used four online gene expression databases to chart the regional 
expression of 110 genes that mark human SHH- or WNT-subtype 
medulloblastomas’. Twenty-four WNT-subtype and 25 SHH-subtype 
medulloblastoma signature genes are contained within ‘Brain Explorer
2’, which generates three-dimensional gene expression maps across the
mouse brain (www.brain-map.org, Supplementary Methods and
Supplementary Data Set 1). As expected’, these data revealed the
URL at embryonic day (E) 11.5 and the cerebellum at E15.5 to be the 
most common sites of SHH-subtype signature gene expression (Fig. 1a,
b and Supplementary Data Set 1). In contrast, WNT-subtype medullo- 
blastoma signature genes were predominantly expressed within the 
LRL at E11.5 (rhombomeres (r) 2-r8) and the dorsal brainstem at 
E15.5. Expression of an additional 61 medulloblastoma signature genes, 
reported by three other online databases, confirmed this differential 
pattern (Supplementary Fig. 1 and Supplementary Table 1). These data 
suggest that SHH- and WNT-subtype medulloblastomas arise from 
distinct regions of the hindbrain and identify the dorsal brainstem as 
a potential source of WNT-subtype tumours. 

If SHH- and WNT-subtype medulloblastomas have different origins, 
we reasoned that these tumours should demonstrate anatomical differ- 
ences at diagnosis. Remarkably, all validated WNT-subtype medullo- 
blastomas examined (n = 6/6, Supplementary Fig. 2) were located 
within the IV ventricle and infiltrated the dorsal surface of the brain- 
stem, whereas all SHH-subtype tumours (n = 6/6) were distributed 
away from the brainstem within the cerebellar hemispheres (Fig. 1c, d 
and Supplementary Fig. 3, exact Mann-Whitney P < 0.005). Five of the 
six WNT-subtype, but no SHH-subtype, tumours were adherent to the 
dorsal brainstem at surgery (Fisher’s exact test, P< 0.005). Thus WNT- 
subtype medulloblastomas are anatomically distinct from SHH tumours 
and are intimately related to the IV ventricle and dorsal brainstem. 

We noted various cell types surrounding the IV ventricle that 
could give rise to WNT-subtype medulloblastomas, including dorsal 
brainstem progenitors of cochlear, mossy-fibre and climbing-fibre
neurons (Fig. 1a, b and Supplementary Fig. 4)’. But it remained pos- 
sible that cerebellar ventricular-zone radial glia’*’® or GNPCs generate 
WNT-subtype medulloblastomas. To identify hindbrain cells that are 
susceptible to transformation by Ctnnb1, we generated mice carrying a 
cre-dependent mutant allele of Ctnnb1 (Ctnnb1^lox(ex3))”” and the Blbp-
Cre transgene’®. Blbp-Cre induces efficient recombination in progenitor
cell populations across the hindbrain including the cerebellar ventricular
zone, GNPCs of the external germinal layer (EGL) and Olig3+ progenitor
cells in the LRL’’ (Supplementary Fig. 5). We also generated Blbp-
Cre^+/−;Ctnnb1^+/lox(ex3) (hereafter, Ctnnb1-mutant) mice that were
homozygous for a cre-dependent mutant allele of Tp53 (Tp53^flx)”°
because loss of this tumour suppressor accelerates medulloblastoma
formation in Ptch1^+/− mice”. As expected, Ctnnb1-mutant embryos
expressed mutant nuclear-Ctnnb1 in all hindbrain germinal zones, 
regardless of Tp53 status (Supplementary Figs 5k and 6). Surprisingly, 
mutation of Ctnnb1 did not affect significantly the proliferation or 
Figure 1 | WNT and SHH subtypes of medulloblastoma are anatomically
distinct. a, b, Expression distribution in (a) E11.5 and (b) E15.5 mouse 
hindbrain of orthologues that distinguish human WNT- and SHH-subtype 
medulloblastoma (Supplementary Data Set 1). Cartoons in b denote the 
position of rhombomeres relative to the cerebellum and brainstem. c, Top, pre- 
operative, and bottom, post-operative, MRI scans of exemplary SHH- and 
WNT-subtype medulloblastomas. Right panels show close-up views of left 
panels. Brainstem (BSt), post-operative tumour cavity (cvt). d, Frequency and 
site of post-operative surgical cavities of SHH- (n = 6) and WNT- (n = 6) 
subtype medulloblastomas. Axial (left) and sagittal (right) views are shown. 
apoptosis of ventricular-zone cells or GNPCs in the cerebellum
(Fig. 2a and Supplementary Fig. 7). 

Because GNPCs generate SHH-subtype medulloblastomas”*, we
sought additional evidence that these cells are not impacted by mutant
Ctnnb1. First, we generated Atoh1-Cre^+/−;Ctnnb1^+/lox(ex3) mice
because Atoh1-Cre drives efficient recombination in GNPCs, generat- 
ing medulloblastomas in conditional Ptch1 mice (see Supplementary 
Fig. 8a-j and ref. 7). We also used the Atoh1 enhancer element present 
in the Atoh1-Cre allele to drive expression of a constitutively active 
Ctnnb1-green fluorescence fusion protein (GFP) in GNPCs (Atoh1- 
Ctnnb14N7°S"*, Supplementary Fig. 8k-o)”. Neither Atoh1-Cre*’; 
Ctnnb1*"°*) nor Atoh1-Ctnnb 148°"? mice (more than 20 mice 
examined each) developed hyperplasia or masses within the URL or 
EGL. Concordantly, aberrant Ctnnb1 signalling did not impact the 
proliferation of GNPCs ex vivo (Supplementary Fig. 8p). Thus, in 
contrast to aberrant Shh signalling, mutant Ctnnb1 does not appear 
to disrupt cell-cycle or differentiation control in GNPCs. 

In stark contrast to the cerebellum, by E16.5 all Ctnnb1-mutant mice 
developed aberrant cell collections in the dorsal brainstem that per- 
sisted into adulthood (exact Mann-Whitney P< 0.005, Fig. 2a-f). 
These cells were marked by Olig3 and Pax6, which suggested they 
may be derived from progenitor cells within the LRL'’”’ (Fig. 2d, e). 
This abnormality was independent of Tp53 status and did not involve 
the floor plate that is not targeted by Blbp-Cre (Supplementary Fig. 9). 
Progenitors within the embryonic dorsal brainstem proliferate to 
produce daughter cells that express specific marker proteins and follow 
complex migration streams to their respective nuclei in the developing 
brainstem (Supplementary Fig. 4)'°. We observed no significant 
differences in the overall proliferation (Ki67 labelling), apoptosis 
(TdT-mediated dUTP nick end labelling) or cell-cycle duration (5- 
bromo-2-deoxyuridine pulse-chase) of progenitors in the dorsal brain- 
stem of Ctnnb1-mutant versus control mice (Fig. 2c, data not shown). 
However, a significant fraction of proliferating cells within Ctnnb1-
mutant dorsal brainstems expressed Zic1 (37% Zic1+/Ki67+ = 122/
322; Fig. 2c, f-h). This expression is aberrant because Zic1 normally
marks postmitotic mossy-fibre neuron precursors as they exit the dorsal
brainstem to form nuclei in the ventral brainstem” (Fig. 2g). Thus
mutant Ctnnb1 might stall the dorso-ventral migration of brainstem 
neuron precursors, resulting in aberrant dorsal cell collections”®. To test 
this, we used in utero GFP electroporation to track the fate of embry- 
onic dorsal brainstem precursors (Fig. 2i-q and Supplementary Figs 10 
and 11). GFP-labelled Zic1+ mossy-fibre neuron precursors underwent
normal migration from the dorsal brainstem to the pontine grey
nucleus and other brainstem nuclei in control mice (Fig. 2k-n and 
Supplementary Fig. 11). In contrast, mutation of Ctnnb1 markedly 
reduced the numbers of precursors transiting from the dorsal brainstem
to the pontine grey nucleus (Fig. 2o-q; exact Mann-Whitney,
P < 0.05). Together, these data demonstrate that mutant Ctnnb1 disrupts
the normal differentiation and migration of progenitor cells on
the dorsal brainstem, resulting in the accumulation of aberrant cell 
collections. These cells may include stalled mossy-fibre neuron precur- 
sors, but further work is required to determine their precise lineage. 

Aberrant cell collections in the dorsal brainstem of Ctnnb1-mutant 
mice are reminiscent of the EGL hyperplasia that precedes formation of 
SHH-subtype medulloblastoma in the cerebellum of Ptch1-deficient 
mice*’. Therefore we aged Ctnnb1-mutant mice harbouring Tp53^+/+,
Tp53^+/flx or Tp53^flx/flx alleles to test if WNT-subtype medulloblastomas
might arise from the dorsal brainstem (n more than 54 mice per genotype).
Aberrant cell collections persisted throughout adulthood on the
dorsal brainstem of all Ctnnb1-mutant;Tp53^+/+ mice, but these animals
did not develop medulloblastoma or tumours in any part of the
hindbrain (median follow up 365 days). In contrast, 2 out of 10
Ctnnb1-mutant;Tp53^flx/flx mice aged older than 6 months harboured
asymptomatic tumours that were confined to the dorsal brainstem
(Supplementary Fig. 12). When aged for longer periods, 15% (n = 8/55) of
Ctnnb1-mutant;Tp53^flx/flx and 4% (n = 2/54) Ctnnb1-mutant;Tp53^+/flx
Figure 2 | Mutant-Ctnnb1 causes aberrant accumulation of LRL cells.
a, Low- (scale bar, 180 µm) and b, high- (scale bar, 50 µm) power views of LRL/
dorsal brainstem in Ctnnb1 mutant and wild-type E16.5 embryos; b includes
the corresponding adult brainstem region. c, Volume and indicated
immunoreactivity differences between Ctnnb1-mutant and wild-type LRL
(n = 3 mice per group; bars, mean ± s.d.). Immunofluorescence of Olig3
(d), Pax6 (e) and Zic1 (f) in Ctnnb1-mutant E16.5 LRL (left) and aberrant adult
dorsal brainstem masses (right) (scale bar, 180 µm). Inset, high-power views of
“ (scale bar, 5 µm). g, Postmitotic mossy-fibre precursor neurons


mice developed ‘classic’ medulloblastomas that were Zic1+ and contained
populations of nuclear-Ctnnb1+/Olig3+ cells (median follow
up 290 and 287 days, respectively; Fig. 3a-d). These mouse medullo- 
blastomas displayed an immunoprofile similar to human WNT-subtype 
tumours and were invariably connected with the brainstem (Fig. 3d, e 
and Supplementary Fig. 13). In contrast, mouse models of human SHH- 
subtype medulloblastoma*’”””* are nuclear-Ctnnb1 negative, arise 
within the cerebellum and do not invade the brainstem (Fig. 3d, e). 
Together, these data support the hypothesis that progenitor cells within 
the dorsal brainstem are susceptible to transformation by concurrent 
(Zic1+/Ki67−) exit the proliferating E16.5 control LRL. h, Ctnnb1-mutant LRL
contains aberrant proliferating Zic1+ precursors (arrows; scale bar, 50 µm).
i, GFP-electroporated wild-type LRL marks Olig3+ cells (j) and migrating
precursors (arrows in i) that include Zic1+ mossy-fibre neurons (k) that form
the pontine grey nucleus (l). GFP-fluorescence of whole (m, o) and sectioned
(n, p) Ctnnb1-mutant and wild-type P0 hindbrains electroporated at E12.5.
q, Mean ± s.d. of LRL/pontine grey nucleus GFP fluorescence in whole
hindbrains of three Blbp-Cre;Ctnnb1^+/+ and five Blbp-Cre;Ctnnb1^+/lox(ex3)
mice (graphs; *P < 0.05, **P < 0.005, exact Mann-Whitney).


mutation in Ctnnb1 and Tp53, resulting in the formation of tumours that 
mimic the anatomical features of human WNT-subtype medulloblas- 
toma. Deletion of Tp53 is presumably required to allow key second 
mutations during transformation of the LRL in Ctnnb1-mutant mice. 
Notably, we have observed two cases of TP53-mutant human WNT- 
subtype medulloblastoma, suggesting this gene also suppresses these 
tumours in humans (Supplementary Fig. 14). 

To test further the fidelity of Ctnnb1-mutant mouse medulloblas- 
toma as a model of human WNT-subtype disease, we compared the 
tumour transcriptomes in the two species using an algorithm we have 
Figure 3 | Mutant-Ctnnb1 and SHH-subtype mouse medulloblastomas are
anatomically distinct. a, Tumour-free survival of SHH-subtype 
medulloblastoma mouse models (Nes-Cre*/ “Liga 3Tp53 ~~ Nes- 

Cre“! ;Xreca™"",T p53, Ptch1*'~sInk4c /~, Ptch1 p53, data from 
refs 14, 27, 28) and Ctnnb1-mutant;Tp53^flx/flx and Ctnnb1-mutant;Tp53^+/flx mice,
**Log rank P < 0.0001. Immunofluorescence of (b) Zic1 and (c) Olig3 and
Ctnnb1 expression in a Ctnnb1-mutant;Tp53^flx/flx medulloblastoma.
d, Haematoxylin and eosin-stained low- (i, v; scale bar, 800 µm) and high- (ii, vi;
scale bar, 25 µm) power views of mouse medulloblastomas and tumour-brainstem
interface (iii, vii; scale bar, 50 µm). Ctnnb1 immunostaining (iv, viii; scale bar,
10 µm, arrows indicate nuclear immunoreactivity). Boxes indicate location of
high-power views. e, Frequency and anatomical site of mouse medulloblastomas.
Figure 4 | Mutant-Ctnnb1 mouse medulloblastomas recapitulate the
molecular characteristics of human WNT-subtype disease. a, AGDEX
comparison of Ctnnb1-mutant;Tp53^flx/flx mouse medulloblastoma, and mouse
EGL, E16.5 dorsal brainstem (DBS) and human medulloblastoma subgroups.
b, Unsupervised clustering of human WNT- and SHH-subtype
medulloblastoma signature orthologue expression in E16.5 DBS, Ctnnb1-
mutant;Tp53^flx/flx mouse medulloblastoma (Ctnnb1 MB), P7 GNPCs and
developed for cross-species genomic comparisons'*. Remarkably, the
transcriptome (n = 11,049 orthologues) of Ctnnb1-mutant;Tp53^flx/flx
medulloblastomas matched only human WNT-subtype medulloblas- 
toma and the cells of the embryonic dorsal brainstem (both permuted 
P<0.05), validating it as a model of this human tumour subtype and 
further pinpointing the brainstem as the source of WNT-subtype 
medulloblastomas (Fig. 4a, b). Finally, because human WNT-subtype 
medulloblastomas selectively delete chromosome 6 (ref. 3), we looked 
in Ctnnb1-mutant mouse medulloblastomas to see if syntenic regions 
of this chromosome are deleted (Fig. 4c). DNA microarray analysis 
identified a single common deletion of mouse chromosome 17 3.2 cM/ 
human 6q25.3 in tumours in the two species. This locus encodes a 
single gene, TULP4, that is a distant member of the tubby-gene family 
implicated in regulating neuronal cell apoptosis (ref. 29). Thus
Ctnnb1-mutant;Tp53flx/flx mouse medulloblastomas accurately model the
molecular characteristics of human WNT-subtype tumours and pin- 
point TULP4 as a novel candidate suppressor gene of these tumours. 
By demonstrating that subtypes of medulloblastoma have distinct cel- 
lular origins, our data should significantly accelerate the hunt for 
curative treatments of these diseases, which must now account for 
the different developmental origins of these tumours. 


METHODS SUMMARY 

MRI analysis. MRI images of patients were spatially normalized into a standard
stereotaxic space for quantitative comparison of tumour location (SPM5;
www.fil.ion.ucl.ac.uk/spm). Radiologists masked to patient subtype determined the
three-dimensional location of the tumour or surgical cavity relative to pre-defined
anatomical landmarks.

Expression mapping. The expression of mouse orthologues of key signature
genes of human WNT- and SHH-subtype medulloblastoma (Supplementary
Data Set 1 and Supplementary Table) was mapped in the developing mouse
hindbrain using four publicly accessible data sets (see Supplementary Methods).

Mouse studies. Blbp-Cre, Ctnnb1lox(ex3), Atoh1-Cre and Tp53flx mice
were bred to generate appropriate genotypes and subjected to clinical surveillance
for signs of tumour development. RosaYFP and RosaLacZ reporter strains traced
the lineage of Cre-recombined cells. Mouse tumours comprised at least 85%
tumour cells. Atoh1-Ctnnb1“” transgenic mice were generated by pro-nuclear
injection. In utero electroporation and cell tracking were performed by
anaesthetizing pregnant mice of the appropriate genotype. The uterus was externalized and
the dorsal brainstem of E12.5 embryos electroporated with CMV-eGFP plasmid
DNA. GNPCs for culture studies were isolated from postnatal day 7 Atoh1-GFP
transgenic mice. GFP+ cells (2 × 10° per well) were cultured in
poly-D-lysine-coated 96-well plates and challenged with mutant-Ctnnb1-GFP, control GFP virus,
Wnt1 protein (50 ng ml−1) or Shh supernatant (3 µg ml−1) before pulsing with
[methyl-3H]thymidine and scintillation counting.

Histology, messenger RNA and DNA microarray profiling.
Immunohistochemistry was performed using routine techniques and primary antibodies
of the appropriate tissues as described (Supplementary Methods). Cells
undergoing apoptosis were detected with the ApopTag kit (Millipore, S7100). Messenger
RNA expression (GEO accession number GSE24628) and DNA copy number
profiles (available at http://stjuderesearch.org/site/authors/gilbertson) were generated
from mouse and human tissues using appropriate microarray platforms as detailed
(Supplementary Methods). Reverse transcriptase real-time PCR and gene
re-sequencing of human medulloblastomas were performed as described previously’.
Messenger RNA expression and DNA microarray profiles of human and mouse
medulloblastomas were integrated using established and novel bioinformatic and
statistical approaches.


Received 26 September 2009; accepted 15 October 2010. 
Published online 8 December 2010. 


1. Ellison, D. W. et al. β-Catenin status predicts a favorable outcome in childhood
medulloblastoma: the United Kingdom Children’s Cancer Study Group Brain
Tumour Committee. J. Clin. Oncol. 23, 7951-7957 (2005).

2. Gajjar, A. et al. Risk-adapted craniospinal radiotherapy followed by high-dose 
chemotherapy and stem-cell rescue in children with newly diagnosed 
medulloblastoma (St Jude Medulloblastoma-96): long-term results from a 
prospective, multicentre trial. Lancet Oncol. 7, 813-820 (2006). 

3. Thompson, M. C. et al. Genomics identifies medulloblastoma subgroups that are 
enriched for specific genetic alterations. J. Clin. Oncol. 24, 1924-1931 (2006). 

4. Kool, M. et al. Integrated genomics identifies five medulloblastoma subtypes with
distinct genetic profiles, pathway signatures and clinicopathological features.
PLoS ONE 3, e3088 (2008).


5. Gilbertson, R. J. & Ellison, D. W. The origins of medulloblastoma subtypes. Annu. 
Rev. Pathol. 3, 341-365 (2008). 

6. Goodrich, L. V., Milenkovic, L., Higgins, K.M. & Scott, M. P. Altered neural cell fates and 
medulloblastoma in mouse patched mutants. Science 277, 1109-1113 (1997). 

7. Schuller, U. et al. Acquisition of granule neuron precursor identity is a critical
determinant of progenitor cell competence to form Shh-induced
medulloblastoma. Cancer Cell 14, 123-134 (2008).

8. Yang, Z. J. et al. Medulloblastoma can be initiated by deletion of Patched in
lineage-restricted progenitors or stem cells. Cancer Cell 14, 135-145 (2008).

9. Romer, J. T. et al. Suppression of the Shh pathway using a small molecule inhibitor
eliminates medulloblastoma in Ptc1+/−p53−/− mice. Cancer Cell 6, 229-240
(2004).

10. Rudin, C. M. et al. Treatment of medulloblastoma with Hedgehog pathway inhibitor
GDC-0449. N. Engl. J. Med. 361, 1173-1178 (2009).

11. Louis, D., Ohgaki, H., Wiestler, O. & Cavenee, W. (eds) World Health Organization 
Classification of Tumours of the Central Nervous System (International Agency for 
Research on Cancer, 2007). 

12. Huang, X. et al. Transventricular delivery of Sonic hedgehog is essential to
cerebellar ventricular zone development. Proc. Natl Acad. Sci. USA 107,
8422-8427 (2010).

13. Johnson, R.A. et al. Cross-species genomics matches driver mutations and cell 
compartments to model ependymoma. Nature 466, 632-636 (2010). 

14. Lee, Y. et al. A molecular fingerprint for medulloblastoma. Cancer Res. 63,
5428-5437 (2003).

15. Ray, R. S. & Dymecki, S. M. Rautenlippe Redux – toward a unified view of the
precerebellar rhombic lip. Curr. Opin. Cell Biol. 21, 741-747 (2009).

16. Morales, D. & Hatten, M. E. Molecular markers of neuronal progenitors in the
embryonic cerebellar anlage. J. Neurosci. 26, 12226-12236 (2006).

17. Harada, N. et al. Intestinal polyposis in mice with a dominant stable mutation of the
beta-catenin gene. EMBO J. 18, 5931-5942 (1999).

18. Hegedus, B. et al. Neurofibromatosis-1 regulates neuronal and glial cell
differentiation from neuroglial progenitors in vivo by both cAMP- and
Ras-dependent mechanisms. Cell Stem Cell 1, 443-457 (2007).

19. Storm, R. et al. The bHLH transcription factor Olig3 marks the dorsal
neuroepithelium of the hindbrain and is essential for the development of
brainstem nuclei. Development 136, 295-305 (2009).

20. Jonkers, J. et al. Synergistic tumor suppressor activity of BRCA2 and p53 in a
conditional mouse model for breast cancer. Nature Genet. 29, 418-425 (2001).

21. Wetmore, C., Eberhart, D. E. & Curran, T. Loss of p53 but not ARF accelerates
medulloblastoma in mice heterozygous for patched. Cancer Res. 61, 513-516 (2001).

22. Chenn, A. & Walsh, C. A. Regulation of cerebral cortical size by control of cell cycle 
exit in neural precursors. Science 297, 365-369 (2002). 

23. Landsberg, R. L. et al. Hindbrain rhombic lip is comprised of discrete progenitor 
cell populations allocated by Pax6. Neuron 48, 933-947 (2005). 

24. DiPietrantonio, H. J. & Dymecki, S. M. Zic1 levels regulate mossy fiber neuron
position and axon laterality choice in the ventral brain stem. Neuroscience 162,
560-573 (2009).

25. Farago, A. F., Awatramani, R. B. & Dymecki, S. M. Assembly of the brainstem 
cochlear nuclear complex is revealed by intersectional and subtractive genetic fate 
maps. Neuron 50, 205-218 (2006). 

26. Oliver, T. G. et al. Loss of patched and disruption of granule cell development in a
pre-neoplastic stage of medulloblastoma. Development 132, 2425-2439 (2005).

27. Uziel, T. et al. The tumor suppressors Ink4c and p53 collaborate independently 
with Patched to suppress medulloblastoma formation. Genes Dev. 19, 
2656-2667 (2005). 

28. Frappart, P. O. et al. Recurrent genomic alterations characterize medulloblastoma
arising from DNA double-strand break repair deficiency. Proc. Natl Acad. Sci. USA
106, 1880-1885 (2009).

29. Ikeda, A., Ikeda, S., Gridley, T., Nishina, P. M. & Naggert, J. K. Neural tube defects and
neuroepithelial cell death in Tulp3 knockout mice. Hum. Mol. Genet. 10,
1325-1334 (2001).


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements R.J.G. holds the Howard C. Schott Research Chair from the Malia’s 
Cord Foundation, and is supported by grants from the National Institutes of Health 
(R01CA129541, P01CA96832 and P30CA021765), the Collaborative Ependymoma
Research Network and by the American Lebanese Syrian Associated Charities. We are 
grateful to A. Chenn, J. Johnson and C. Birchmeier for their gifts of reagents and the staff 
of the Hartwell Center for Bioinformatics and Biotechnology and ARC at St Jude 
Children’s Research Hospital for technical assistance. 


Author Contributions R.J.G. conceived the research and planned experiments. P.G. also
planned and conducted most of the experiments. Y.T., G.R., D.S.C., M.C.T., T.H., H.P., J.M.,
J.C.L., Y.L., F.Z., C.E., S.C.C., M.F.R., P.J.M. and R.W.-R. conducted experiments. D.F. and
S.P. provided bioinformatic expertise. A.G., F.A.B. and R.A.S. provided clinical advice and
tumour samples. D.H.G. provided the Blbp-Cre mouse and data. M.M.T. provided the
Ctnnb1lox(ex3) mouse. Z.P. and R.O. reviewed and analysed the human MRI scans.
D.W.E. provided pathology review. All authors contributed to writing the manuscript.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to R.J.G. (Richard.Gilbertson@stjude.org).


23/30 DECEMBER 2010 | VOL 468 | NATURE | 1099 


©2010 Macmillan Publishers Limited. All rights reserved 


LETTER


doi:10.1038/nature09584 


mTORC1 controls fasting-induced ketogenesis and its modulation by ageing


Shomit Sengupta, Timothy R. Peterson, Mathieu Laplante, Stephanie Oh & David M. Sabatini


The multi-component mechanistic target of rapamycin complex 1
(mTORC1) kinase is the central node of a mammalian pathway that
coordinates cell growth with the availability of nutrients, energy and
growth factors’. Progress has been made in the identification of
mTORC1 pathway components and in understanding their functions
in cells, but there is relatively little known about the role of the
pathway in vivo. Specifically, we have little knowledge regarding the
role mTORC1 has in liver physiology. In fasted animals, the liver
performs numerous functions that maintain whole-body homeostasis,
including the production of ketone bodies for peripheral tissues
to use as energy sources. Here we show that mTORC1 controls
ketogenesis in mice in response to fasting. We find that liver-specific
loss of TSC1 (tuberous sclerosis 1), an mTORC1 inhibitor’, leads to
a fasting-resistant increase in liver size, and to a pronounced defect
in ketone body production and ketogenic gene expression on fasting.
The loss of raptor (regulatory associated protein of mTOR,
complex 1), an essential mTORC1 component’, has the opposite
effects. In addition, we find that the inhibition of mTORC1 is
required for the fasting-induced activation of PPARα (peroxisome
proliferator activated receptor α), the master transcriptional
activator of ketogenic genes’, and that suppression of NCoR1
(nuclear receptor co-repressor 1), a co-repressor of PPARα’,
reactivates ketogenesis in cells and livers with hyperactive mTORC1
signalling. Like livers with activated mTORC1, livers from aged mice
have a defect in ketogenesis**, which correlates with an increase in
mTORC1 signalling. Moreover, we show that the suppressive effects
of mTORC1 activation and ageing on PPARα activity and ketone
production are not additive, and that mTORC1 inhibition is
sufficient to prevent the ageing-induced defect in ketogenesis. Thus,
our findings reveal that mTORC1 is a key regulator of PPARα
function and hepatic ketogenesis and suggest a role for mTORC1
activity in promoting the ageing of the liver.

Whereas mice lacking the mTORC1 components mTOR or raptor
die in early embryogenesis®, mice treated with pharmacological
inhibitors of mTORC1 or with tissue-specific deletions of raptor or mTOR are
viable, and are beginning to reveal diverse roles for mTORC1 in adult
physiology®. To begin the study of mTORC1 in liver physiology, we
determined the effects of fasting and feeding on hepatic mTORC1
activity. In fasted mice, mTORC1 activity in the liver was low (Fig. 1a,
Supplementary Fig. 1a), as detected by the phosphorylation of the
ribosomal S6 protein, an established marker of mTORC1 pathway activity.
Refeeding led to an increase in phospho-S6 levels that was blocked by
rapamycin, an mTORC1 inhibitor. mTORC1 activation preceded that
of Akt (Fig. 1b), an effector of the insulin-activated PI3K pathway, which
is consistent with mTORC1 responding not only to insulin but also to
other food-triggered signals, such as nutrients.

We examined the functions of mTORC1 in the liver using genetically
engineered mice with the liver-specific deletion of the gene for raptor,
or TSC1, a negative regulator of mTORC1 (Fig. 1c) (Methods). We refer
to mice lacking hepatic TSC1 or raptor as Li-Tsc1KO or Li-RapKO
mice, respectively. In Li-Tsc1KO mice, the mTORC1 pathway was
constitutively active and not affected by fasting or feeding, while the
loss of raptor eliminated mTORC1 activity irrespective of feeding status
(Fig. 1d). Compared to controls, TSC1 or raptor deletion led to an
~40% increase or decrease, respectively, in liver mass, hepatocyte size,
and protein content (Fig. 1e; Supplementary Fig. 1b, c). Whereas in
control animals a 24-h fast caused a ~25% reduction in liver mass, the
livers of Li-Tsc1KO mice were largely refractory to the shrinking effects
of fasting. In addition, fasting did not further decrease the size of the
already small livers of Li-RapKO mice (Fig. 1e). Thus, mTORC1 is
strongly regulated by fasting and feeding and plays a major role in
setting liver size in response to the nutritional state.

We measured the levels of several serum and liver metabolites in
control, Li-Tsc1KO and Li-RapKO mice that were fasted or given ad
libitum access to food (Supplementary Fig. 1d). Because mTORC1
activation suppresses Akt signalling’ (Supplementary Fig. 1e), we also
examined Li-IrKO (also known as LIRKO; ref. 8) mice that lack the
insulin receptor in the liver and thus have attenuated Akt signalling’.
Levels of most serum and hepatic metabolites were not significantly
affected by TSC1 loss, except that fasted Li-Tsc1KO mice had markedly
low serum ketones, a phenotype not shared by Li-IrKO mice (Fig. 2a;
Supplementary Fig. 1d, f, g). Compared to control animals, Li-Tsc1KO
mice had decreased locomotor activity and body temperature upon
fasting (Supplementary Fig. 1h, i), phenotypes also observed in other
mutant mice with defective ketogenesis’.

The ketone bodies, acetoacetate and β-hydroxybutyrate, are
produced by the liver primarily from fatty acids released by adipose tissue
and are used by tissues to generate acetyl-CoA for energy production
during fasting. The defect in ketone production in Li-Tsc1KO mice is
not due to an impairment in fatty acid uptake by the liver, as they were
unable to generate ketones even when given sodium octanoate, a fatty
acid that freely diffuses into liver mitochondria and serves as a
ketogenic substrate’ (Supplementary Fig. 2a).

As Li-Tsc1KO mice have a defect in ketone production when fasted, we
asked if Li-RapKO mice could produce ketones when fed. Li-RapKO mice
given food ad libitum do not have elevated levels of ketones (Fig. 2a),
perhaps because the serum fatty acids that are ketogenic substrates are at
low levels in fed mice!’. Indeed, when Li-RapKO mice were administered
the ketogenic substrate sodium octanoate upon refeeding after a fast,
they did produce ketones for several hours after feeding, even at times
when ketone levels had dropped precipitously in control animals
(Supplementary Fig. 2b). Furthermore, Li-RapKO or rapamycin-treated
mice do have elevated serum ketones at the short times after refeeding
when the control animals already have baseline ketone levels
(Supplementary Fig. 2c, d).

The defect in ketogenesis in the Li-Tsc1KO mice is liver-autonomous,
as liver tissue from these mice failed to oxidize fatty acids or produce
ketones ex vivo (Fig. 2b). To confirm these results in cells in vitro, we
developed a ketogenic medium containing the PPARα agonist WY-14643
(Methods) that induces ketone production in murine AML12
hepatocytes (Fig. 2c). Consistent with the in vivo and ex vivo findings, the
suppression of TSC1 or TSC2 inhibited, in a rapamycin-sensitive
Figure 1 | In the liver, mTORC1 activity is sensitive to fasting and feeding
and regulates organ size. a, Images of liver sections from mice fasted for 24 h
or fasted and refed for 45 min, and co-stained for serine 240/244
phosphorylated-S6 (p-S6; red) and DNA (blue). b, Mice were fasted for 24 h or
fasted and then refed for 15, 30 or 45 min, or injected with rapamycin 2 h before
refeeding for 45 min. Liver lysates were analysed by immunoblotting for the
indicated proteins and phosphorylation states. c, Immunoblot analyses for
TSC1 or raptor protein in liver lysates from indicated mouse strains that have or
do not have Cre recombinase expression in the liver. d, Indicated mice were
killed after being given food ad libitum (fed AL), fasted for 24 h (fasted), or
fasted and refed for 1 or 2 h. Liver lysates were analysed by immunoblotting for
the levels of S6 and serine 240/244 phosphorylated-S6. Control-a are
Tsc1loxP/loxP mice administered the empty adenovirus, and control-b are
RaploxP/loxP mice not carrying the Albumin-Cre transgene. The same
nomenclature is used in the subsequent figures. e, Gross images of livers from
Li-Tsc1KO or Li-RapKO mice that were fed ad libitum or fasted for 24 h (fasted).
Bar graph shows mean ± s.d. normalized liver weight for n = 5. The percentage
changes in liver weight compared to respective fed control mice are indicated.
*P < 0.05 compared to respective fed control mice.


fashion, ketone production by AML12 cells (Fig. 2c; Supplementary
Fig. 3a). Taken together, our loss-of-function data indicate that
mTORC1, in a liver-autonomous fashion, is a key regulator of ketone
production in response to fasting.

Because the nuclear hormone receptor PPARα is a master activator
of the hepatic ketogenic gene expression program in response to
fasting’, we asked if mTORC1 controls ketogenesis by modulating PPARα
function. PPARα transactivates its own gene as well as those for
enzymes required for fatty acid oxidation and ketogenesis, such as
Cpt1a, AOX, HMGCS2 and HMGCL”. Fasting increased the mRNA
levels of PPARα and its target genes in control, but not TSC1-null,
livers (Fig. 2d). In contrast, loss of TSC1 did not block the
fasting-induced increase in the mRNA for PEPCK, which is not a target of
PPARα (Fig. 2d). In Li-Tsc1KO mice, WY-14643, the synthetic PPARα
Figure 2 | mTORC1 inhibits ketogenesis and PPARα activity in a
liver-autonomous fashion. a, Fed mice were given ad libitum access to food and
killed at the beginning of the day. Fasted mice were denied food for 24 h and
killed at the same time of day as the fed mice. Indicated values are mean ± s.d.
for n = 6; *P < 0.05 compared to fasted control mice. b, Indicated
measurements were made as described in Methods using liver tissue isolated
from control or Li-Tsc1KO mice that had ad libitum access to food (fed) or had
been fasted for 24 h. Values are mean ± s.d. for n = 4. *P < 0.05 compared to
fasted control mice. c, AML12 cells stably expressing validated lentiviral
shRNAs targeting GFP, TSC1 or TSC2 were placed in control or ketogenic
media with or without 20 nM rapamycin. Total ketones in the culture media
were determined after a 3-day incubation. Values are mean ± s.d. for n = 6.
*P < 0.05 compared to shGFP-expressing cells cultured in ketogenic media.
n.s., no significant differences between bracketed values. d, mRNA levels were
quantified by qRT-PCR in total RNA isolated from indicated liver samples.
Values are mean ± s.d. for n = 8. *P < 0.05 compared to fasted control mice.
e, mTORC1 activation inhibits, in a rapamycin-sensitive fashion,
PPARα-target gene expression in cells in culture. mRNA levels were determined as in
d from the AML12 cells used in c. Values are mean ± s.d. for n = 6. *P < 0.05
compared to shGFP-expressing cells growing in ketogenic media. f, Control
and Li-RapKO mice were fasted for 24 h and then refed for 2 h. Levels of
indicated mRNAs were measured as in d. Values are mean ± s.d. for n = 4.
*P < 0.05 compared to refed control mice.


agonist®, did not increase serum ketones or PPARα-target gene
expression in the liver (Supplementary Fig. 2e, f), but did activate
PPARα in the small intestine (a secondary site of ketogenesis),
confirming that Li-Tsc1KO mice have liver-specific defects in PPARα
function (Supplementary Fig. 2f). In accord with the in vivo findings, a
knockdown of TSC2 in AML12 cells inhibited PPARα in a
rapamycin-sensitive fashion (Fig. 2e). Lastly, as suggested by the capacity of
Li-RapKO mice to produce ketones in the fed state (Supplementary
Fig. 2b), feeding did not downregulate the expression of PPARα and its
target genes in the livers of Li-RapKO mice (Fig. 2f) or in mice treated
with rapamycin before feeding (Supplementary Fig. 2g). Because
PPARα overexpression did not restore PPARα-target gene expression
in livers or AML12 cells with activated mTORC1 (Supplementary Fig.
4a-f), these results are consistent with mTORC1 negatively regulating
ketogenesis by preventing the activation of PPARα. Unlike mice
lacking PPARα'’, those without TSC1 do not have hepatic steatosis
upon fasting, perhaps because plasma triglyceride and hepatic
microsomal triglyceride transfer protein (MTTP) levels are increased in
these mice (Supplementary Fig. 1d and data not shown), suggesting
that mTORC1 promotes very low density lipoprotein assembly and
secretion.

In the fed state, PPARα interacts with the NCoR1 and SMRT
corepressors, which suppress ketogenic gene expression by recruiting
histone deacetylases’*. Upon fasting, ligand-binding to PPARα initiates
corepressor release and the association of coactivators, like p300 or
CBP, which activate ketogenic genes by recruiting histone acetylases’’.
In the livers of control animals, fasting led to an increase in histone
acetylation at PPARα response element (PPRE)-containing promoters,
which correlated with the loss and gain of NCoR1 and p300,
respectively, from the promoters (Fig. 3a-c). In contrast, in TSC1-null livers,
NCoR1 did not exit the PPRE-containing promoters upon fasting, and
p300 occupancy and histone acetylation remained in a fed-like state
(Fig. 3a-c). Raptor-null livers had the opposite phenotypes
(Supplementary Fig. 5a, b) and TSC1 loss did not affect the binding of
PPARα to promoters (Supplementary Fig. 5c). In AML12 cells, the
knockdown of TSC2 repressed the disassociation of NCoR1 from
PPRE-containing promoters, even when the cells were treated with
the PPARα ligand (Supplementary Fig. 5d). The suppression of
NCoR1 restored ketone production and PPARα-target gene expression
in TSC2-deficient AML12 cells and livers (Fig. 3d, e, Supplementary
Fig. 6a-c), while the histone deacetylase inhibitor trichostatin A
reversed the defect in ketone production in AML12 cells caused by
mTORC1 activation (Supplementary Fig. 6d).

As these results suggested an active role for NCoR1 in the inhibition
of ketogenesis and PPARα by mTORC1, we asked if mTORC1
regulates NCoR1 function. Because mTORC1 did not affect NCoR1 protein
levels (data not shown), we considered the possibility that fasting and
feeding control the nuclear localization of NCoR1 in an
mTORC1-dependent fashion. Quantitative imaging assays using an antibody that
recognizes endogenous NCoR1 (Supplementary Fig. 7a) showed that
NCoR1 was present in both the cytoplasm and nuclei of hepatocytes in
fed animals but only in the cytoplasm in fasted mice (Supplementary
Fig. 8a). The loss of TSC1 led to the presence of NCoR1 in the nucleus
even in fasted mice (Supplementary Fig. 8a), while that of raptor
prevented the feeding-induced movement of NCoR1 into the nucleus
(Supplementary Fig. 8b). Analogous results were obtained with
endogenous or epitope-tagged recombinant NCoR1 in cultured AML12
cells (Supplementary Figs 7b, 8c). Thus, mTORC1 regulates the
subcellular localization of NCoR1, providing a potential mechanism for
how mTORC1 might control PPARα and ketogenesis.

Given that mTORC1 regulates ketogenesis and previous work
showing that ageing blunts PPARα-target gene expression and ketone
production in rodents*’, we asked if mTORC1 mediates the effects of
ageing on ketogenesis. If this were the case, the inhibitory effects on
ketogenesis of mTORC1 activation and ageing should not be additive.
Indeed, while the loss of TSC1 in young mice reduced serum ketones
and hepatic PPARα-target gene expression, the deletion of TSC1 in
aged mice (Supplementary Fig. 3c) did not further reduce the already
low levels of serum ketones and PPARα and Cpt1a mRNAs observed
in old animals (Fig. 4a, b). Furthermore, in aged mice, fasting did not
inhibit liver mTORC1 activity to nearly the same degree as in young
mice (Fig. 4c). Consistent with this defect in mTORC1 inhibition,
ageing greatly impaired the fasting-induced exit of NCoR1 from the
nucleus and the release of NCoR1 from PPRE-containing promoters
Figure 3 | mTORC1 requires the NCoR1 corepressor to inhibit PPARα
function. a-c, At PPRE-containing promoters, mTORC1 activation in the liver
prevents fasting-induced histone H4 acetylation (a), p300 occupancy (b), and
release of NCoR1 (c). Liver extracts from control or Li-Tsc1KO mice that were fed
or fasted as in Fig. 2a were subjected to ChIP assays (Methods). Values are
mean ± s.d. relative enrichment values for n = 4. *P < 0.05 compared to fasted
control mice. d, e, AML12 cells stably expressing lentiviral shRNAs targeting
GFP, NCoR1, TSC2, or NCoR1 and TSC2 were incubated in control or ketogenic
media for 3 days and total media ketones (d) and indicated mRNA levels were
determined as in Fig. 2d (e). Values are mean ± s.d. for n = 5. *P < 0.05
compared to shGFP-expressing cells cultured in ketogenic media (d, e).


(Fig. 4d, Supplementary Fig. 9a). In contrast, in Li-RapKO mice,
NCoR1 remained cytoplasmic in the livers of both young and aged
animals (Supplementary Figs 8b, 9a). Remarkably, aged Li-RapKO
mice did not suffer (unlike aged control mice) a reduction in ketone
production or PPARα-target gene expression during fasting (Fig. 4e, f;
Supplementary Figs 3d, 9b). Collectively, these findings establish that
mTORC1 activity mediates the suppression of hepatic ketogenesis
induced by ageing.
In conclusion, mTORC1 regulates ketogenesis and PPARα activity
in response to fasting and feeding as well as ageing. The control of
NCoR1 subcellular localization by mTORC1 may be how mTORC1
regulates PPARα and ketogenesis. Because activated mTORC1
inhibits PPARα function in cells treated with a PPARα agonist, mTORC1
may also regulate PPARα through additional mechanisms, such as by
preventing PPARα from responding appropriately to ligand binding
or by inhibiting PPARα coactivators. The finding that mTORC1
promotes an ageing phenotype in the liver is consistent with substantial
evidence showing that inhibition of the TORC1 pathway extends
lifespan in diverse organisms'*'’”. If mTORC1 regulates PPARα in
Figure 4 | mTORC1 mediates the ageing-induced inhibition of PPARα
function and ketone production. a, b, Young (2-8 months) or aged
(20-24 months) mice having the conditional null allele of TSC1 (Tsc1loxP/loxP) were
injected with an empty adenovirus or a Cre recombinase-expressing
adenovirus. Two weeks later, total serum ketones (a) and indicated liver mRNA
levels (b) were measured in fed or fasted mice. Values are mean ± s.d. for n = 4.
*P < 0.05 compared to fasted young mice administered an empty adenovirus.
c, Immunoblotting was used to measure the levels of serine 240/244
phosphorylated and total S6 in the livers of fed or fasted control mice that were
young or aged. d, ChIP assays were performed on indicated liver extracts
(Methods). Values are mean ± s.d. for n = 5. *P < 0.05 compared to fasted
young mice. e, f, Indicated liver mRNA levels (e) and total serum ketones
(f) were measured in fed or fasted control or Li-RapKO mice that were young
(2-8 months) or aged (20-24 months). Values are mean ± s.d. for n = 5. *P < 0.05
compared to fasted young control mice.


other organs besides the liver, our findings may be relevant to the
ageing-induced decline in PPARα function that is known to occur in
organs besides the liver’®.


METHODS SUMMARY 


All animal studies and procedures were approved by the MIT Institutional Animal
Care and Use Committee. Mice were given chow ad libitum, fasted or refed for the
indicated times and indicated serum metabolites were measured. Hepatic
metabolite measurements (20), ex vivo fatty acid oxidation (19) and ketogenesis assays
(19), and sodium octanoate (10), rapamycin (21) and WY-14643 (3)
administrations were performed as previously described (refs 3, 10, 19-21). Chromatin
immunoprecipitation assays were performed using a kit from Millipore according to the
manufacturer’s instructions. Significance P values were obtained by performing
non-paired, two-tailed Student’s t-tests to compare two groups.
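
The statistical comparison named above can be illustrated with a minimal sketch of an unpaired, two-tailed Student's t-test. This is not the authors' analysis code: the function names and the example numbers are hypothetical, equal variances are assumed (the classic pooled-variance form of the test), and the P value is obtained by numerically integrating the t density so that only the Python standard library is needed.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=10000):
    """Two-tailed P value: 1 - 2 * integral of the density from 0 to |t| (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


def students_t_test(group_a, group_b):
    """Unpaired Student's t-test with pooled variance; returns (t, df, p)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)


if __name__ == "__main__":
    # Hypothetical serum ketone readings (arbitrary units), not data from the paper.
    fasted_control = [1.9, 2.1, 2.4, 2.0, 2.2, 2.3]
    fasted_knockout = [0.6, 0.8, 0.7, 0.9, 0.5, 0.7]
    t, df, p = students_t_test(fasted_control, fasted_knockout)
    print(f"t = {t:.2f}, df = {df}, P = {p:.2g}")
```

With groups of n = 4 to 8, as in the figure legends, the test has only 6 to 14 degrees of freedom, so the pooled-variance assumption matters; a Welch-type test would be the usual alternative if the group variances differ markedly.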


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

Materials. TSC1*"””*” mice and Li-Ir<° mice were gifts from D. Kwiatkowski 
(Harvard Medical School) and C.R. Kahn (Joslin Diabetes Center), respectively. 
Antibodies to TSC1, raptor, phospho-S235/236 S6, phospho-240/244 S6, and S6 
were purchased from Cell Signaling Technology; the antibody to NCoR1 for ChIP 
experiments and immunofluorescence assays from Abcam (ab24552); the antibody 
to NCoRI1 for immunofluorescence studies from Thermo Scientific (PA1-844A); 
the p300 (sc-585x) and PPAR« (sc-9000x) antibodies from Santa Cruz Bio- 
technology; antibodies to the acetyl-histone H4 (06-866) from Millipore; and the 
Cy3-conjugated secondary antibody from Invitrogen. WY-14643 was purchased 
from Cayman chemicals; rapamycin from LC Labs; dexamethasone, transferrin, 
insulin, sodium octanoate, selenium, trichostatin A, and oleic acid from Sigma; 
radiolabelled oleic acid from Perkin Elmer; PPAR«&-expressing adenovirus from 
Vector Biolabs; high-titre empty adenovirus and Cre recombinase-expressing- 
adenovirus from the Gene Transfer Vector Core at the University of Iowa; and 
AML12 cells from ATCC. Lentiviral shRNAs targeting murine TSC1, TSC2 and 
NCoRI were obtained from The RNAi Consortium (TRC) collection of the Broad 
Institute” and from Sigma-Aldrich. The sequences for each shRNA are as follows. 
shNCoR1, CCGGCCTCTAATACAGGCACTTCAACTCGAGTTGAAGTGCCTGTATTAGAGGTTTTTG;
shTsc1, CCGGGCCAGTGTTTATGCCCTCTTTCTCGAGAAAGAGGGCATAAACACTGGCTTTTTG;
shTsc2, CCGGGCCCGATATGTGTTCTCCAATCTCGAGATTGGAGAACACATATCGGGCTTTTTG;
shGFP, CCGGTACAACAGCCACAACGTCTATCTCGAGATAGACGTTGTGGCTGTTGTATTTTT.
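
The listed oligonucleotides can be sanity-checked for hairpin structure. The sketch below assumes a layout inferred from the sequences themselves rather than stated in the text: a CCGG prefix, a 21-nt sense arm, a CTCGAG loop, the antisense arm, and a trailing T-run. The function names and default parameters are illustrative only.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def split_hairpin(oligo, stem_len=21, loop="CTCGAG", prefix="CCGG"):
    """Split an shRNA oligo into (sense, antisense, tail) under the assumed layout."""
    if not oligo.startswith(prefix):
        raise ValueError("oligo does not start with the expected prefix")
    body = oligo[len(prefix):]
    sense = body[:stem_len]
    loop_end = stem_len + len(loop)
    if body[stem_len:loop_end] != loop:
        raise ValueError("loop not found directly after the sense arm")
    antisense = body[loop_end:loop_end + stem_len]
    tail = body[loop_end + stem_len:]
    return sense, antisense, tail


# shNCoR1 sequence as given in the text, with the line break removed.
SH_NCOR1 = "CCGGCCTCTAATACAGGCACTTCAACTCGAGTTGAAGTGCCTGTATTAGAGGTTTTTG"
sense, antisense, tail = split_hairpin(SH_NCOR1)
arms_pair = reverse_complement(sense) == antisense
```

A check of this kind is also a quick way to catch characters lost when a sequence is re-joined across a line break.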

Generation of Li-Tsc1 KO and Li-Rap KO mice. Mice carrying a floxed allele of
TSC1 (Tsc1 loxP/loxP; ref. 23) were backcrossed to C57BL/6 mice for three generations,
and then bred to homozygosity. For liver-specific recombination of the floxed
TSC1 allele, 100 µl of high-titre adenoviral-Cre ((3-6) × 10° p.f.u. ml⁻¹) was
administered via retro-orbital injection under isoflurane anaesthesia to mice
ranging from 2 to 20 months in age, and subsequent experiments were performed
within a month. PCR genotyping for the floxed and recombined allele was performed
as previously described. For generation of Li-Rap KO mice, a BAC clone
(identifier: RP24-125C11, strain: C57BL/6J) containing raptor exon 6 was
obtained from the RPCI-24 mouse genomic DNA library. Standard PCR and
cloning procedures were used to generate fragments spanning 4.3 kb upstream of 
raptor exon 6 and 2.6 kb downstream of and including raptor exon 6 (as well as a 3’ 
LoxP site downstream of raptor exon 6) that were subsequently assembled 
together into the PGKneoF2L2DTA vector. In the final targeting construct,
the neomycin resistance (neo) cassette was flanked by one 5’ LoxP site and 5’
and 3’ Flp recognition target (Frt) recombination sites. Raptor exon 6 was flanked
by a 5’ and 3’ LoxP site located 111 bp upstream and 547 bp downstream, respectively.
The targeting vector was linearized and electroporated into embryonic stem
cells derived from 129/SvEv mice. Clones were analysed for correct integration
by Southern blot and PCR analysis. Chimaeric mice were obtained by microinjection
of the correctly targeted clones into BALB/C blastocysts and crossed with
C57BL/6 mice to obtain offspring with germline transmission. Mice analysed in
this study were backcrossed to C57BL/6 for four generations. Some of the backcrosses
involved mice constitutively expressing the FLP recombinase so as to
excise the neomycin cassette from the targeted allele. Mice were bred to homozygosity
for the floxed raptor allele (Rap loxP/loxP), and then crossed to mice expressing
the Cre-recombinase transgene from the liver-specific albumin promoter.
PCR genotyping of Rap loxP/loxP and Li-Rap KO mice was performed with the
following primers. 3’ LoxP site of targeted raptor allele:
forward, CTCAGTAGTGGTATGTGCTCAG; reverse, GGGTACAGTATGTCAGCACAG.
This PCR reaction generates an amplicon of 174 bp when the 3’
LoxP site is present and of 140 bp when the wild-type allele is present.
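
The two amplicon sizes translate directly into a genotype call. The following is a minimal sketch, assuming band sizes have already been estimated from a gel; the function name, the genotype labels and the 10 bp tolerance are assumptions for illustration, not part of the published protocol.

```python
FLOXED_BP = 174     # amplicon when the 3' LoxP site is present
WILD_TYPE_BP = 140  # amplicon from the wild-type allele


def call_raptor_genotype(band_sizes_bp, tolerance_bp=10):
    """Classify a sample from its estimated band sizes (in bp)."""

    def has_band(expected):
        return any(abs(size - expected) <= tolerance_bp for size in band_sizes_bp)

    floxed, wild_type = has_band(FLOXED_BP), has_band(WILD_TYPE_BP)
    if floxed and wild_type:
        return "loxP/+"
    if floxed:
        return "loxP/loxP"
    if wild_type:
        return "+/+"
    return "no call"
```

Because the expected bands differ by 34 bp, a tolerance of 10 bp cannot assign one band to both alleles.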

Albumin Cre recombinase: forward, GTTAATGATCTACAGTTATTGG; 
reverse, CGCATAACCAGTGAAACAGCATTGC. This PCR reaction generates 
an amplicon of ~500 bp and indicates the presence of the transgene. 

In all experiments involving the Li-Tsc1 KO mice, the control mice were Tsc1 loxP/loxP
mice that were administered ‘empty’ adenovirus of a similar titre as the
Cre-expressing adenovirus (called ‘control-a’ mice in figures). For all experiments
involving Li-Rap KO mice, the control mice were Rap loxP/loxP mice that did not have
the Albumin Cre transgene (‘control-b’ mice in figures). For the experiment
involving Li-Ir KO mice, the control mice were Ir loxP/loxP mice that did not have
the Albumin Cre transgene (‘control-c’ mice in figures). No significant changes in
body weight, adiposity or satiety were observed in Li-Tsc1 KO and Li-Rap KO mice
compared to their respective controls. The increased liver size of Li-Tsc1 KO mice
did not affect the sizes of other organs.

Animal experiments. WY-14643 was suspended in 10% DMSO, 90% corn oil at
1.5 mg ml⁻¹ and was administered via oral gavage to mice for 5 consecutive days at
a dose of 25 mg kg⁻¹ (ref. 29). 500 mM sodium octanoate in 0.9% sodium chloride
was given to mice via intra-peritoneal injections at a dose of 6 µl per gram of body
weight (ref. 30). Rapamycin was given to mice via intra-peritoneal injections at a
dose of 10 mg per kg body weight (ref. 31). For hepatic overexpression of GFP and
PPARα, mice were administered high-titre adenovirus expressing either cDNA via
retro-orbital injection under isoflurane anaesthesia and killed 5 days later. For 
depletion of NCoR1 in the liver, mice were administered high-titre adenovirus 
expressing shRNAs targeted to NCoR1 or lacZ via retro-orbital injection under 
isoflurane anaesthesia and killed 6 days later. Low-titre adenovirus expressing
shRNAs was constructed using the Block-iT Adenoviral RNAi Expression
System (Invitrogen) per manufacturer’s instructions. High-titre virus was generated
by infecting HEK-293T cells with low-titre virus (10¹⁰ p.f.u. ml⁻¹), waiting until
some cell death was observed, and then concentrating 200 ml of the culture media 
into 1 ml using the Vivapure AdenoPACK 100 (Sartorius Stedim Biotech). Fasting 
experiments began at lights out and ended after the times indicated in the figures. 
Activity measurements were performed in cages where infrared light beams were 
placed every 1.5 inches along the length of the cage and beam breaks were measured 
using a digital counter. Body temperature was measured using an anal probe 
accurate to 0.1°C. All experiments were carried out with approval from the 
Committee for Animal Care at MIT and under supervision of the Department of 
Comparative Medicine at MIT. 
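
The per-animal amounts implied by these dosing regimens follow from simple unit conversions. The sketch below works them through for a hypothetical 30 g mouse; the body weight and the function names are assumptions, while the concentrations and doses are those given in the text.

```python
def gavage_volume_ml(body_weight_g, dose_mg_per_kg=25.0, conc_mg_per_ml=1.5):
    """Volume of WY-14643 suspension needed for one gavage."""
    dose_mg = dose_mg_per_kg * body_weight_g / 1000.0
    return dose_mg / conc_mg_per_ml


def octanoate_volume_ul(body_weight_g, ul_per_g=6.0):
    """Injection volume of the sodium octanoate solution."""
    return ul_per_g * body_weight_g


def octanoate_amount_umol(body_weight_g, ul_per_g=6.0, conc_mm=500.0):
    """Amount of octanoate delivered; 1 mM equals 1 µmol per ml."""
    return conc_mm * octanoate_volume_ul(body_weight_g, ul_per_g) / 1000.0


def rapamycin_dose_mg(body_weight_g, dose_mg_per_kg=10.0):
    """Rapamycin dose for one injection."""
    return dose_mg_per_kg * body_weight_g / 1000.0


weight_g = 30.0  # hypothetical body weight
wy_ml = gavage_volume_ml(weight_g)
octanoate_ul = octanoate_volume_ul(weight_g)
octanoate_umol = octanoate_amount_umol(weight_g)
rapamycin_mg = rapamycin_dose_mg(weight_g)
```

For this example weight the functions give 0.5 ml of WY-14643 suspension, 180 µl (90 µmol) of sodium octanoate, and 0.3 mg of rapamycin.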

Immunofluorescence assays. Immunofluorescence-based imaging of NCoR1 
was performed as follows: fixed liver tissue was embedded in paraffin and 
3-5 µm thick sections placed on microscope slides. Paraffin-coated sections were
then de-waxed using EZ-DeWax deparaffinization solution (BioGenex), placed in 
boiling citrate buffer, pH 6.0 for ten minutes, and then blocked in PBS with 0.1% 
Tween-20 and 5% goat serum. Sections were stained overnight with the primary 
antibody in blocking solution at 4°C, washed 3 times in PBS/Tween and then 
incubated at room temperature with Cy3-conjugated secondary antibody in 
blocking solution for 45 min. Sections were incubated in Hoechst solution for 
10 min to stain the DNA, and then coverslipped. Immunofluorescence assays in
AML12 cells were performed as previously described. Tiled images were
obtained from an inverted epifluorescence microscope (Zeiss) and the exposure 
time for each channel was kept constant for all slides on a given day. Signal 
intensity was quantified using ImageJ (NIH) as described below. The PA1-844A 
antibody (Thermo Scientific) was used for the NCoR1 immunofluorescence studies 
shown in the figures. Equivalent results were also obtained with the ab24552 
(Abcam) antibody. 

Image analysis. Quantification of fluorescence intensity, pixel location, and hepatocyte
size was performed using the NIH software ImageJ. Greyscale 1,048 × 792
pixel images acquired at ×63 magnification of cells immunostained with the anti-NCoR1
antibody were used for measuring cellular NCoR1 localization. For each
condition, 6 cells from 2 individual images from each liver were measured for a 
total of 12 cells per liver and thus 60 cells per condition. A line 90 pixels in length, 
which was sufficiently long to span the nucleus and some of the surrounding 
cytoplasmic area of all cells, was placed over each cell, such that the midpoint of 
the line was over the centre of the circle defined by the nucleus. The pixel intensity 
along the line was then recorded. Given the variability in nuclei size, the width of 
multiple nuclei per experimental group was also measured. A shaded area, whose 
width equals the mean diameter of the nuclei plus one standard deviation, was 
then superimposed upon the plots of pixel intensity of the NCoR1 staining to 
indicate the cellular location of the pixel intensity measurements. 
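
The line-profile measurement can be sketched in a few lines. This example assumes a horizontal 90-pixel line (the text does not state the orientation), an image held as a list of pixel rows, and the sample standard deviation for nuclear width; the function names and the synthetic image are illustrative and do not reproduce the ImageJ workflow itself.

```python
import statistics


def line_profile(image, centre_x, centre_y, length=90):
    """Pixel intensities along a horizontal line whose midpoint is the nucleus centre."""
    row = image[centre_y]
    start = centre_x - length // 2
    if start < 0 or start + length > len(row):
        raise ValueError("line would run off the image")
    return row[start:start + length]


def nuclear_band(nuclear_widths_px, length=90):
    """Start and end of the shaded nuclear region, in profile coordinates."""
    # Band width is the mean nuclear diameter plus one standard deviation.
    width = statistics.mean(nuclear_widths_px) + statistics.stdev(nuclear_widths_px)
    midpoint = length / 2
    return midpoint - width / 2, midpoint + width / 2


# Synthetic 200 x 10 greyscale image whose intensity equals the x coordinate.
image = [list(range(200)) for _ in range(10)]
profile = line_profile(image, centre_x=100, centre_y=5)
band_start, band_end = nuclear_band([30, 34, 38])
```

Averaging such profiles over the 60 cells per condition (12 cells per liver, which implies five livers) would give the plotted intensity curves.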

Serum metabolite and hepatic measurements. Tail blood or blood obtained 
from retro-orbital bleeds at the time of death was centrifuged at low speed at 
4°C for 30 min and serum isolated for metabolite measurements. Total ketones 
were measured using a colorimetric assay from Wako Chemicals according to
the manufacturer’s protocol. Total glucagon levels were measured using an ELISA kit
from Wako Chemicals. True serum triglycerides were measured using a kit from 
Sigma, and values represent total triglycerides minus serum glycerol. Non-esterified 
free fatty acids were measured with a colorimetric assay from Roche, while serum 
insulin was measured using an ELISA kit from DSLabs. Serum glucose measurements
were taken from tail blood using an instant glucometer (Ascensia Elite).
Hepatic triglycerides were extracted as previously described, and measured as
above.

Quantitative RT-PCR. Total RNA was isolated from cells and tissues using the 
RNeasy kit from Qiagen. Equal amounts of total RNA for each sample were used for
oligo-d(T) (Invitrogen) primed reverse transcription into cDNA using SuperScript
II (Invitrogen). Primers for real-time PCR were obtained from Integrated DNA
Technologies. Reactions were run on an Applied Biosystems Prism machine using
Sybr Green Master Mix (Applied Biosystems). The amount of β-actin cDNA was
used to normalize results from gene-specific reactions. Primer sequences used to
produce gene-specific amplicons are as follows.
NCOR1: forward, GAAGCCACAGCAGAAGAACC; reverse, ACGACCATGTTCTACCAGGC.
HMGCS2: forward, ATACCACCAACGCCTGTTATGG; reverse, CAATGTCACCACAGACCACCAG.
PPARA: forward, AGAGCCCCATCTGTCCTCTCG; reverse, ACTGGTAGTCTGCAAAACCAAA.
CPT1a: forward, CCATGAAGCCCTCAAACAGATC; reverse, ATCACACCCACCACCACGATA.
TSC1: forward, ATGGCCCAGTTAGCCAACATT; reverse, GCTGAGAATTGGTTTCCAGGT.
β-Actin: forward, GGCTGTATTCCCCTCCATCG; reverse, CCAGTTGGTAACGCCATGT.
HMGCL: forward, ACTACCCAGTCCTGACTCCAA; reverse, TAGAGCAGTTCGCGTTCTTCC.
PEPCK: forward, CGATGACATCGCCTGGATGA; reverse, TCTTGCCCTTGTGTTCTGCA.
ACOX: forward, GCCTGAGCTTCATGCCCTCA; reverse, ACCAGAGTTGGCCAGACTGC.
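
The text states only that β-actin cDNA was used to normalize gene-specific reactions. One common way to do this, shown here purely as an assumption, is the comparative threshold-cycle calculation (2 to the power of minus ΔΔCt), which presumes near-perfect amplification efficiency for both primer pairs. The function name and Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_actin, ct_target_ctrl, ct_actin_ctrl):
    """Fold change of a target gene versus a control sample, normalized to β-actin."""
    delta_sample = ct_target - ct_actin             # target relative to β-actin
    delta_control = ct_target_ctrl - ct_actin_ctrl  # same, in the control sample
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold two cycles earlier in the
# sample than in the control, with identical β-actin Ct values.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
```

With these made-up inputs the target is four-fold more abundant in the sample than in the control.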

Ex vivo liver measurements. The hepatic ex vivo fatty-acid oxidation assay was 
performed as previously described. Briefly, livers were removed from mice and
three ~40 mg portions were placed in individual wells of a 24-well plate along with
1 ml of Krebs-Ringer saline and 1 mM ³H-oleic acid-BSA (final concentration was
1 µCi per mM ³H-oleic acid). Livers were incubated at 37°C and 5% oxygen for
2 h. After two hours, the medium was removed, 10 µl was retained for ketone
measurements, and the rest transferred to an Eppendorf tube with no cap. 
Tubes were placed in scintillation vials that contained 2 ml of water, wrapped in 
aluminium foil, and incubated overnight at 65°C. The next day the vials were 
cooled at 4°C for 30 min, the Eppendorf tube was removed, scintillation fluid was 
added, and the levels of ³H₂O were measured using a scintillation counter.

In vitro ketogenesis in AML12 cells. For passaging, AML12 cells were cultured in 
medium prescribed by ATCC that contains insulin and serum. For induction of 
ketogenesis, cells were first grown to confluence, washed once with PBS, and then 
incubated in media devoid of serum and insulin and containing 50 µM WY-14643
and 2 mM sodium octanoate. At indicated times, aliquots of the media were
removed and total ketone levels measured. Total RNA was also isolated from each 
well and used to normalize media ketone levels between wells. When employed, 
lentiviral shRNAs were used as described. Using qRT-PCR, all shRNAs were
validated to knock down their respective targets by at least 70% (Supplementary
Fig. 2a, b). 

Chromatin immunoprecipitation assays. Chromatin immunoprecipitation 
assays were performed using a kit from Millipore according to the manufacturer’s 
instructions. Liver portions were crosslinked in 1.5% formaldehyde in PBS for 
15 min at room temperature, and the reaction was quenched with 0.125 M glycine. 
Liver cells were disaggregated using 20 gauge syringe needles. Resulting cells were 
then lysed in 1% SDS lysis buffer for 10 min, and sonicated at 30 s intervals for a
total of 5 min. Resulting lysates were diluted in buffer containing 0.01% SDS, 1.1%
Triton, 1.2 mM EDTA, 167 mM NaCl and 16.7 mM Tris-HCl. Diluted lysates were
pre-cleared with protein-A agarose/salmon sperm and then incubated overnight
at 4°C with 3 µl of NCoR1 antibody, 5 µl of acetyl-histone H4 antibody, and 4 µg
of p300 antibody or 4 µg IgG per 10° cells. The next day, the antibodies were
captured with protein-A agarose/salmon sperm for 1 h, pelleted, and subjected
to washes in low salt, high salt, LiCl and TE buffers contained in the Millipore kit. 


The chromatin was eluted from antibodies with 1% SDS and 0.1 M NaHCO₃, and
the crosslinks reversed by heating at 65 °C for 4 h. The chromatin was treated with
proteinase K, purified using the High Pure PCR Template Preparation Kit from 
Roche, and used for PCR analysis. AML12 cells were processed with the same 
protocol except beginning at the step in which cells are lysed in 1% SDS lysis buffer. 
The primers below were used to amplify a 150-200 bp amplicon encompassing 
the PPRE of the indicated genes. As both a negative control and to ensure that 
proper shearing length was achieved for genomic DNA, PCR was also performed 
for an amplicon within the second intron of Cpt1a.
PPARA PPRE: forward, TTCCGAACCATTCTTTCCAG; reverse, GCTGCCTTCTTTTGCAGAGT.
HMGCS2 PPRE: forward, TGAGCCACTCAGCAGAGGAATCAG; reverse, CTGGGTTGGGCTTTATAAGACTCC.
CPT1a PPRE: forward, CTTTCCTACTGAGGCCCAGATAG; reverse, TACAGCCTAGAACCCTGACTGC.
CPT1a intron: forward, CTGGTTGGAATAGGTGTGTCACTG; reverse, ATTGGGGCTGGCTTACAGGTTC.
The histone variant macroH2A suppresses
melanoma progression through regulation of CDK8 
Cancer is a disease consisting of both genetic and epigenetic
changes. Although increasing evidence demonstrates that tumour 
progression entails chromatin-mediated changes such as DNA 
methylation, the role of histone variants in cancer initiation and 
progression currently remains unclear. Histone variants replace 
conventional histones within the nucleosome and confer unique 
biological functions to chromatin. Here we report that the histone
variant macroH2A (mH2A) suppresses tumour progression
of malignant melanoma. Loss of mH2A isoforms, histone variants
generally associated with condensed chromatin and fine-tuning of
developmental gene expression programs, is positively correlated
with increasing malignant phenotype of melanoma cells in
culture and human tissue samples. Knockdown of mH2A isoforms 
in melanoma cells of low malignancy results in significantly 
increased proliferation and migration in vitro and growth and 
metastasis in vivo. Restored expression of mH2A isoforms rescues 
these malignant phenotypes in vitro and in vivo. We demonstrate 
that the tumour-promoting function of mH2A loss is mediated, at 
least in part, through direct transcriptional upregulation of CDK8. 
Suppression of CDK8, a colorectal cancer oncogene, inhibits
proliferation of melanoma cells, and knockdown of CDK8 in cells
depleted of mH2A suppresses the proliferative advantage induced 
by mH2A loss. Moreover, a significant inverse correlation between 
mH2A and CDK8 expression levels exists in melanoma patient 
samples. Taken together, our results demonstrate that mH2A is a 
critical component of chromatin that suppresses the development 
of malignant melanoma, a highly intractable cutaneous neoplasm. 

The H2A family is the largest and most diverse histone family, and 
includes vertebrate-specific mH2A1 (splice variants mH2A1.1 and 
1.2) and mH2A2 (refs 1,3,9-11), which are generally associated with 
transcriptionally repressed chromatin. However, mH2A is widely
distributed throughout chromatin and exists in post-translationally
modified forms, suggesting additional unidentified functions for
this variant. 

Given increasing evidence for variant-mediated transcriptional control
and recent reports describing variants as prognostic markers in
cancer, we hypothesized that global alteration of variants could
contribute to malignant melanoma, the most lethal form of skin cancer
with rising incidence. Its radial growth phase (RGP) is characterized
by lateral melanocyte growth and its vertical growth phase (VGP) by
spread of melanoma cells into the dermis and subcutis, upon which
metastasis can occur.

Using well characterized, paired series of murine and human melanoma
cell lines, we probed the H2A variant profile. The murine B16
series represents cells of increasing metastatic potential, and the
human series comprises a primary melanoma (WM115) and two subsequent
skin metastases derived from this same patient (WM266-4 and
WM165-1). In highly malignant cells of the murine and human
series, a global decrease of mH2A1 and mH2A2 protein and messenger
RNA (mRNA) was observed (Fig. 1a and Supplementary Figs 1 and 2).
Analysis of histones from both series using multiplexed quantitative
mass spectrometry confirmed these findings (Supplementary Fig. 1).
Furthermore, mH2A1 and mH2A2 loss was observed in a panel of 
primary and metastatic melanoma cells (Supplementary Fig. 1). 
Interestingly, a 1.5- to threefold increase in H2A.Z levels (often associated
with promoters of active genes) was also observed (Fig. 1a and
Supplementary Fig. 3), implicating possible H2A variant exchange
during melanoma progression. Consistent with a global loss of
mH2A and increased H2A.Z levels, we observed highly decondensed
chromatin in B16-F10 cells by micrococcal nuclease digestion (Supplementary
Fig. 2).

Next, we performed immunohistochemistry (IHC) on approximately
115 human tissues ranging from benign nevi to metastatic
melanoma (tissue set 1, Supplementary Fig. 4). mH2A2 antibody 
was used for IHC, as it produced clear nuclear staining, and tissues 
were independently scored (0-3) by two blinded dermatopathologists 
with excellent interobserver consistency (κ = 0.80). IHC demonstrated
that although mH2A2 is abundant in melanocytes of benign
nevi and RGP lesions, its expression is dramatically lost in more than
80% of VGP and metastatic melanomas (P < 0.001) (Fig. 1b and
Supplementary Fig. 4). This suggests mH2A loss occurs during the 
critical RGP to VGP transition. IHC was also performed on 25 melanomas
with known BRAF status (D.P., unpublished observations).
An activating mutation of BRAF, V600E, is present in approximately
65% of melanomas. Although this data set did not reveal a significant
correlation between mH2A2 loss and V600E mutation, it produced 
similar mH2A2 results as the first cohort, as did a tissue microarray 
(Supplementary Fig. 4). Using fresh tissues, we observed significantly 
reduced levels of mH2A1 and mH2A2 mRNA in metastatic melanoma 
specimens compared with nevi and primary melanocytes (Fig. 1c).
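
Interobserver agreement of the kind reported for the two dermatopathologists is often summarized with Cohen's kappa. The sketch below computes the unweighted statistic for two raters scoring the same tissues on the 0-3 scale; whether the authors used the unweighted or a weighted form is not stated, so this choice, the function name and the example scores are assumptions.

```python
from collections import Counter


def cohens_kappa(scores_a, scores_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(scores_a)
    # Fraction of items on which the raters agree.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Agreement expected by chance from each rater's score frequencies.
    freq_a, freq_b = Counter(scores_a), Counter(scores_b)
    expected = sum(freq_a[score] * freq_b[score] for score in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is complete")
    return (observed - expected) / (1.0 - expected)


# Hypothetical IHC scores for four tissues from two raters.
kappa = cohens_kappa([0, 0, 3, 3], [0, 0, 3, 0])
```

Here the raters agree on three of four tissues, chance agreement is 0.5, and kappa is therefore 0.5.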

Owing to the transcriptional downregulation of mH2A in human 
melanoma, combined with its re-expression in metastatic melanoma 
cells upon 5-Aza-2'-deoxycytidine treatment (Supplementary Fig. 5), 
we hypothesized that DNA methylation may enable silencing of mH2A. 
Indeed, through extensive bisulphite sequencing analysis, we identified 
a region of the mH2A2 promoter that is significantly methylated in 
metastatic melanoma tissues and cell lines, but not in primary melano- 
cytes, WM115 cells or benign nevi (Fig. 1d and Supplementary Fig. 5). 

Collectively, these findings prompted us to examine the functional 
consequences of mH2A loss. We established multiple stable short 
hairpin RNA (shRNA) lines in murine B16-F0 and F1 and human
WM115 melanoma cells, targeting mH2A1, mH2A2 and control 
Figure 1 | mH2A loss correlates with increasing melanoma malignancy.
a, Melanoma cells probed for H2A variants; core histones used for loading.
b, IHC of human tissue with mH2A2 (left) and histone H3 (right). Original
magnifications ×20 and ×40 are shown. mH2A2 visualized using DAB
(brown) and haematoxylin (blue). Arrows depict mH2A2 staining in non-melanocytic
cells. c, qRT-PCR of mH2A1 and mH2A2 in benign nevi and
melanocytes (brown circles) and metastatic melanoma (black squares);
P < 0.0001. d, DNA methylation of mH2A2 promoter in nevi (n = 6, 10-12
clones per nevus) and metastatic melanoma tissues (n = 7, 10-14 clones per
tissue); 16 CpG sites shown. Open circles (unmethylated), black circles
(methylated); P values determined by Mann-Whitney U test.


Figure 2 | mH2A depletion and ectopic expression alter malignant properties of
melanoma cells in vitro and in vivo. a, Colony assay of B16-F1 shRNA-expressing
cells. Quantified (below); *P < 0.0000005; mean ± s.e.m. (n = 4). b, Soft agar
assay of WM115 shRNA-expressing cells. Quantified (below); *P < 0.0005;
mean ± s.d. (n = 4). c, Tumour volume (in cubic millimetres) after
subcutaneous injections of B16-F1 shRNA cells; *P < 0.05 at day 14;
mean ± s.e.m. (n = 10 mice per group). d, Trans-well migration assay in B16-F1
and WM115. Quantification below;
green fluorescent protein (GFP) (Supplementary Figs 6 and 7). Two
shRNA-transduced cell lines from mouse and human were used for in-depth
analysis: B16-F1 mH2A1_91 and mH2A2_25 and WM115
mH2A1_90 and mH2A2_05 (Supplementary Figs 6 and 7 for additional
shRNA lines and isoform-specific knockdown).

Proliferation was examined in shRNA-expressing cells by colony 
formation and MTS cell viability assays. The loss of mH2A increases 
proliferation of murine and human melanoma cells (Fig. 2a and Supplementary
Figs 6 and 7), as well as anchorage-independent growth of
WM115 cells (Fig. 2b and Supplementary Fig. 7). To examine growth
potential in vivo, B16-F1 shRNA cell lines were injected subcutaneously
into mice; mH2A-deficient cells exhibited significantly
enhanced tumour growth compared with controls (Fig. 2c). mH2A 
knockdown was confirmed by immunoblotting lysates from tumours 
(Supplementary Fig. 8). 

Because cell motility contributes to metastasis and melanocytes 
originate from migratory neural crest cells, shRNA lines were analysed 
for migratory behaviour. Loss of mH2A in murine and human cells 
enhanced migration through an 8-µm Transwell and the ability to
close an artificial wound, compared with control cells (Fig. 2d and 
Supplementary Figs 6 and 7). Next, murine shRNA cell lines were 
injected into the lateral tail veins of mice to assay metastatic potential. 
Fourteen days after injection, mice were killed and lungs dissected for 
macro- and microscopic histology (Fig. 2e). Lungs of mice injected 
with mH2A1_91 and mH2A2_25 cells showed a five- and 30-fold
increase, respectively, in the number of macroscopic metastases compared
with lungs from control mice (Fig. 2e). Haematoxylin and eosin
and Ki-67 staining revealed metastatic disease with proliferation, 
respectively, in mH2A shRNA-expressing tumours; mH2A knock- 
down in lungs was confirmed by IHC (Supplementary Fig. 8). 

Next, mH2A expression was stably restored in malignant B16-F10 
and human WM266-4 and A375 cells. The core histones H2A, mH2A1 
(1.2) and mH2A2 were fused to mCherry and functional stable lines 
generated (Supplementary Figs 9, 11 and 13). Ectopic expression of 
mH2A1.2 and mH2A2, but not H2A or mCherry alone, resulted in 
reduced proliferation (without evidence of apoptosis) and migration 
(Supplementary Figs 10, 12 and 14). Human A375 cells expressing the 
mCherry series were injected subcutaneously into the flanks of immuno- 
compromised mice; expression of mH2A1.2 and mH2A2 suppressed 
growth (Supplementary Fig. 14). Furthermore, B16-F10 cells expressing 
mCherry fusions were injected into tail veins of mice; mH2A1.2 and 
mH2A2 significantly suppressed metastasis to the lungs (Fig. 2f and 
Supplementary Fig. 10). 

Given the striking phenotypes of mH2A manipulation in melanoma 
cells, we hypothesized that loss of mH2A may alter the transcriptional 
state of proliferation- and metastasis-related genes. We performed 
gene expression profiling using Affymetrix microarrays with B16-F1 
cells (mH2A1_91, mH2A2_25 and sh_GFP). As expected for mH2A’s
role in fine-tuning of gene expression, many genes showed a less than
twofold change (Supplementary Figs 15, 16 and 17 for Venn diagrams, 
heatmaps and gene ontology). Fifteen genes showed at least a twofold 
change, common to both shRNA lines in two independent experi- 
ments, including Integrin alpha 4 (Itga4), transcriptional regulators 
CDK8 (Mediator complex component) and Cited1 (CBP/p300 trans- 
activator) (Fig. 3a). In concordance with our data, Itga4 expression is 
inversely correlated with invasive potential of B16 cells and expression
profiling of human melanoma cells identified Cited1 loss in metastatic
cells. CDK8, however, is a new player in melanoma malignancy.
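
The selection of genes with at least a twofold change in both shRNA lines and both experiments amounts to intersecting per-comparison hit lists. The sketch below assumes each comparison is available as a mapping from gene name to log2 fold change versus the control line; the function name, the gene labels and the values are invented for illustration, and the filter ignores the direction of change, which the text does not specify.

```python
def shared_twofold_genes(log2_fc_tables, min_abs_log2=1.0):
    """Genes whose absolute log2 fold change meets the cutoff in every table."""
    shared = None
    for table in log2_fc_tables:
        hits = {gene for gene, value in table.items() if abs(value) >= min_abs_log2}
        shared = hits if shared is None else shared & hits
    return sorted(shared or [])


# Made-up log2 fold changes for two comparisons; a twofold change is |log2| >= 1.
comparison_1 = {"geneA": 1.4, "geneB": -1.2, "geneC": 0.4}
comparison_2 = {"geneA": 1.1, "geneB": -1.6, "geneC": 1.3}
common_hits = shared_twofold_genes([comparison_1, comparison_2])
```

In practice the list of tables would hold four entries, one per shRNA line per experiment.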

In concordance, CDK8 mRNA and protein levels were elevated both 
in murine and human cells depleted of mH2A (Fig. 3a-c), as well as 
subcutaneous tumours and lung metastases derived from mH2A- 
deficient lines (Supplementary Fig. 18). Next we stably expressed 
RNA interference (RNAi)-resistant mH2A2 and H2A in the B16-F1 
mH2A2_24 shRNA line, which targets the 3’ untranslated region of 
mH2A2. Although CDK8 levels remained high in mH2A2_24 cells 
expressing H2A-mCherry, addition of mH2A2-mCherry rescued 
Figure 3 | Microarray and ChIP analyses identify CDK8 as a direct mH2A-regulated
gene in melanoma. a, Heat map representing gene expression
changes (at least twofold) in B16-F1 mH2A-deficient cells. RT-PCR of CDK8
(bottom); mean ± s.d. (n = 3). b, Immunoblots of CDK8 in murine shRNA
lines; actin for loading. c, Immunoblot and qRT-PCR of CDK8 in human
shRNA lines; mean ± s.d. (n = 3). d, CDK8 and DsRed immunoblots of
mH2A2_24 line expressing H2A- and mH2A2-cherry; asterisk depicts fusion
proteins. e, CDK8 qRT-PCR analysis in A375 subcutaneous tumours;
mean ± s.d. (n = 3). f, mH2A1 ChIP analysis of the −1 kb position from TSS
for CDK8, PACS2, ATP5G1 and GAPDH; intergenic control. IgG used as
control antibody; mean ± s.d. (n = 3).


CDK8 expression to B16-F1 levels (Fig. 3d). Moreover, CDK8 was 
downregulated in subcutaneous tumours derived from A375 cells 
expressing mH2A1- and mH2A2-mCherry (Fig. 3e and Supplemen- 
tary Fig. 14). 

Intrigued by transcriptional upregulation of CDK8, a colorectal 
cancer oncogene, we enquired if CDK8 is a direct target of mH2A.
Chromatin immunoprecipitation (ChIP) analysis demonstrated that
the CDK8 promoter is enriched in mH2A1-containing nucleosomes in
B16-F1, but absent in B16-F10 cells and a control shRNA line (Fig. 3f 
and Supplementary Fig. 18). ChIP analysis of additional mH2A target 
genes (but not an intergenic locus or GAPDH) also revealed enrich- 
ment, and demonstrated that CDK8 is a highly enriched mH2A target 
gene (Fig. 3f). 

By examining a panel of human cell lines, we observed high CDK8 
protein levels in metastatic melanoma cells, comparable to that of 
colon cancer cells (Fig. 4a). We used shRNAs to deplete CDK8 from 
B16-F10 and human A375 and WM165-1 cells, which contain high 
levels of CDK8 (Fig. 4b, Supplementary Figs 19 and 20). The loss of 
CDK8 significantly reduced proliferation (Fig. 4b) mediated by G2/M
arrest in human cells (Supplementary Fig. 20). Conversely, ectopic
expression of CDK8 (and a kinase-defective mutant, D173A) in
B16-F1 cells resulted in significantly increased proliferation (Sup- 
plementary Fig. 21). Although proliferation in murine melanoma cells 
appears independent of the kinase activity of CDK8, it may be 
Figure 4 | CDK8 is a major effector of mH2A loss. a, Melanocytes, metastatic
melanoma and colon cancer cells probed for CDK8; actin for loading.
b, Immunoblot of A375 cells expressing CDK8 shRNA (left), MTS assay (right);
mean ± s.d. (n = 5). c, Tumour volume (in cubic millimetres) after
subcutaneous injections; *P < 0.05 at day 12; mean ± s.e.m. (n = 9 mice per


consistent with recent studies demonstrating a kinase-independent 
role of CDK8 (ref. 27). 

To dissect the relationship between mH2A and CDK8, we first depleted 
CDK8 in mH2A shRNA-expressing cells (B16-F1 and WM115 lines). 
Knockdown of CDK8 was able to suppress the enhanced proliferation 
induced by mH2A loss in vitro and in vivo (Fig. 4c, d and Supplementary
Figs 21 and 22). Knockdown of Med12, a subunit of the CDK8
submodule of Mediator, showed a similar effect (Fig. 4d and Supplementary
Fig. 22), suggesting that CDK8 functions within the
Mediator subcomplex in melanoma. 

Next, we performed quantitative PCR with reverse transcription 
(qRT-PCR) of mH2A and CDK8 in 36 fresh melanoma specimens. 
This analysis demonstrated a statistically significant inverse correlation
of mH2A2 and CDK8 at the mRNA level (Pearson’s r = −0.406;
P = 0.014; Fig. 4e). We performed IHC for CDK8 (κ = 0.58) in human
tissues previously scored for mH2A2, and observed strong CDK8 staining
(scored 2-3) in a large fraction of mH2A2 negative (scored 0)
melanomas (29/38 = 76%; Supplementary Fig. 23). A similar inverse 
trend was observed in a panel of human melanoma cell lines 
(Supplementary Fig. 23). Finally, by probing fresh benign nevus tissues, 
we observed high mH2A and low CDK8 protein levels (Supplementary 
Fig. 23). Collectively, these results strongly suggest that CDK8 is a 
major effector of mH2A-mediated melanoma progression. 
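
The reported correlation can be related to its significance level with the standard conversion from Pearson's r to a t statistic with n − 2 degrees of freedom. The sketch below is illustrative only: the function names are assumptions, and it stops at the t statistic rather than computing a P value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Values reported in the text: r = -0.406 across 36 specimens.
t_reported = t_from_r(-0.406, 36)
```

For r = −0.406 and n = 36 this gives t of about −2.59 on 34 degrees of freedom, which is compatible with a two-tailed P value between 0.01 and 0.02.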

Here we demonstrate that mH2A is globally lost during melanoma 
progression. Similar findings have recently been described in lung 
cancer; mH2A1.1 is enriched in pre-cancerous senescent cells, but lost 
upon bypass of senescence. However, the mechanism by which this
occurs and its biological consequences remain unclear. Our study 
suggests mH2A loss in melanoma, mediated in part by DNA methyla- 
tion, occurs after a potential senescence bypass (that is, in a nevus), but 
rather during the critical RGP to VGP transition. Nevertheless, mH2A 
isoforms may serve as important biomarkers for melanoma, and/or 
other cancers. 

The data presented here point towards a novel mechanism whereby 
CDK8 is regulated by the unique histone variant mH2A. We look 
forward to future studies focused on CDK8 function and its inhibition 
group). d, MTS assay of WM115 mH2A1_90 line co-expressing CDK8 or
Med12 shRNAs; mean ± s.d. (n = 5). e, RT-PCR of mH2A2 and CDK8 in 36
melanoma tissues (from 30 patients; Supplementary Table 4); Pearson’s
r = −0.406 with P = 0.014; mean ± s.d. (n = 3).


in melanoma. Our findings support emerging links between chromatin 
structure and cancer, and, for the first time to our knowledge, demonstrate
a direct role of mH2A in this process.


METHODS SUMMARY 

Cell culture, plasmids, infections and RNAi. Detailed information is described 
in Methods. 

Chromatin fractionation, acid extraction of histones and immunoblotting. 
Chromatin fractionation and acid extraction of histones were performed as
described. Antibodies used for immunoblotting can be found in Methods.

Quantitative mass spectrometry. Quantitative mass spectrometry was performed
as described.

Immunohistochemistry, pathology and statistical analysis. Specimens were 
obtained from Mount Sinai School of Medicine’s Division of Dermatopathology 
(project number HSD08-00565), New York University (IRB number 10362) and 
melanoma tissue microarray (Imgenex number IMH-369). Details on staining, 
pathology and statistical analyses are described in Methods. 

Clinical specimens. Human specimens were collected at the time of surgery. 
Approval to collect melanoma specimens was granted by Mount Sinai 
Biorepository Cooperative and the New York University Interdisciplinary 
Melanoma Cooperative Group (project numbers above). Approval to collect 
benign nevi was granted by Mount Sinai School of Medicine’s Division of 
Dermatopathology (project number 08-0964). 

Bisulphite sequencing. This was performed according to the manufacturer’s 
instructions (Zymo Research). Details are described in Methods. 

Cell proliferation, migration and mouse injections. MTS was performed 
according to the manufacturer’s instructions (Promega). Colony formation and 
soft agar assays were performed as described**. The Transwell migration assay is 
described in Methods. In vivo metastasis assays were performed as described’*. For 
subcutaneous injections, 2.5 × 10° B16-F1 cells were injected into 6-week-old
C57BL/6J mice and 2 × 10⁶ A375 cells were injected into NOG mice (Jackson
Laboratories); tumour volume was measured over 14- and 20-day periods, respectively.

Microarray hybridization, data analysis and hierarchical clustering. Microarray
was performed using two biological replicates according to Affymetrix GeneChip 
protocol. Initial data extraction was performed at the Microarray Shared Research 
Facility at Mount Sinai School of Medicine. Heatmaps were generated using 
Cluster and Tree View programs. 

Quantitative PCR and ChIP. qPCR was performed in triplicate on Stratagene 
Opticon 2 using FastStart SYBR Green Mix (Roche). Expression levels were 
normalized to TATA binding protein or GAPDH. ChIP assays were performed
using a Magna ChIP Kit (Millipore) as per the manufacturer’s instructions. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS

Cell culture, plasmids and infections. Murine B16 and human WM266-4, A375 
and HCT116 cells were maintained in DMEM supplemented with 10% fetal 
bovine serum (FBS) and 1% antibiotics. All other melanoma cells were grown 
in Tu2% media (80% MCDB 153 media, 20% Leibovitz’s L-15 media, 2% FBS, 
5 µg ml⁻¹ bovine insulin, 1.68 mM CaCl₂). Human melanocytes were grown in
Medium 254 (Invitrogen). Lentiviral plasmids encoding shRNAs against murine 
and human mH2A1, mH2A2, GFP (control) and murine CDK8 were obtained 
from Open Biosystems (Thermo Fisher Scientific). Med12 and CDK8 shRNAs 
were provided by J. Espinosa and Addgene (CDK8 (ref. 7)). RNAi sequences are 
listed in Supplementary Table 1. Complementary DNA (cDNAs) encoding 
human H2A, rat mH2A1.2 and human mH2A2 were amplified and cloned into 
the retroviral vector N-Cherry-LPC (gift of M. Narita). Plasmids expressing CDK8 
(pBabe.puro.CDK8 and CDK8-KD’) were obtained from Addgene. Infections 
were performed using standard procedures. 

Chromatin fractionation, acid extraction of histones and immunoblotting.
Chromatin fractionation and acid extraction of histones were performed as described".
Whole-cell extracts were generated by lysing cells directly in Laemmli loading 
buffer, followed by sonication (for tumour tissue), and boiled extensively. The 
following antibodies were used for immunoblotting: mH2A1 (Millipore 07-219); 
mH2A2 (ref. 11); H3 C-terminal (Abcam ab1791 or Millipore 05-928); H4
(Millipore 05-858); H2A.X (Millipore 07-627); H2A.Z (Millipore 07-594); 
CDK8 (Santa Cruz sc-1521); Cited1 (Abcam ab15096); DsRed (Clontech 
632496); and Actin (Sigma A5441). 

Quantitative mass spectrometry. Bulk acid extracted histones were derivatized 
by treatment with propionyl anhydride as described”*. Histones were labelled with 
stable isotope using d10-propionic anhydride (Cambridge Isotope Laboratories).
Online high-performance liquid chromatography separation of peptides was fol- 
lowed by liquid chromatography-mass spectrometry (LC-MS/MS) using an 
LTQ-Orbitrap mass spectrometer (ThermoFisher Scientific) as described”. All 
data were manually inspected for quantification and MS/MS interpretation. 
Two independent experiments using biological replicates were performed.

Immunohistochemistry, pathology and statistical analysis. Specimens were
obtained from Mount Sinai School of Medicine’s Division of Dermatopathology 
by Institutional Review Board approval (project number HSD08-00565) and New 
York University (Institutional Review Board number 10362). Primary RGP melanoma
(Breslow thickness less than 1.0 mm) and VGP melanoma (Breslow thickness
greater than 1.0 mm) were examined. IHC was also performed on BRAF-genotyped
melanoma (New York University) and melanoma tissue microarray (Imgenex
IMH-369). IHC was performed as per the manufacturer’s instructions (Vector
Laboratories). In brief, 5-µm sections from formalin-fixed paraffin-embedded specimens
were deparaffinized, incubated for antigen retrieval with Vector Citrate-Based
Antigen Unmasking Solution (Vector Laboratories H-3300) in microwave for 
10 min, exposed to 0.3% hydrogen peroxide to block endogenous peroxidase activity, 
blocked with Vector Normal Horse Serum (2.5%) for 20 min, incubated with 
mH2A2 (ref. 11) (1:350-1:500) prepared in 0.1% BSA and incubated at 4 °C over- 
night. Slides were subsequently developed using Vector imPRESS Universal Kits 
anti-mouse/rabbit Ig or anti-goat Ig (Vector Laboratories MP-7500 or MP-7405), 
Vector DAB Peroxidase Substrate Kit as the chromogen (Vector Laboratories SK-4100)
and Harris Hematoxylin (Sigma HHS32) for counterstaining. Slides were then
sealed and mounted with Permount (Sigma SP15) and randomized for subsequent 
blinded review. Two independent dermatopathologists (P.O.E. and C.LV.) scored 
specimens for extent of melanocyte nuclear staining (0-3). Slides were compared 
with haematoxylin and eosin sections and all slides stained with H3 (Abcam ab1791 
or Millipore 05-928; both at 1:200) for tissue quality control. For CDK8 staining, 
mH2A negative melanomas were stained (Santa Cruz sc-1521, 1:200, and sc-13155, 
1:15), followed by randomization and scoring. All statistical analyses were conducted 
using SPSS 14 software (SPSS). Average staining score was used for analyses, and 
inter-observer consistency between dermatopathologists was assessed with the κ
coefficient. Statistical significance of mH2A2 scores was first assessed using the 
non-parametric Kruskal-Wallis one-way analysis of variance test, followed by 
two-sided Mann-Whitney U tests. 

Micrococcal nuclease assays. Cells were counted (Beckman Coulter particle 
counter) and evenly aliquoted (2.5 × 10° cells per MNase time point). Each sample
was treated with 1/150 unit of micrococcal nuclease (Sigma) for 2, 5, 7 or 10 min at
37 °C, and the reaction was stopped with 1 mM EGTA. Samples were centrifuged at 10,000g (for
10 min) and DNA extracted using a DNeasy Blood and Tissue Kit (Qiagen). Equal 
amounts of DNA were resolved on 1% agarose gel and stained with ethidium 
bromide. 

Azacytidine treatment and bisulphite DNA methylation analysis. Cells were 
treated with 10 µM 5-aza-2'-deoxycytidine and collected at days 2 and 4. Fresh
medium containing 5-aza-2'-deoxycytidine was added at day 2 for collecting at
day 4. For bisulphite DNA methylation analysis, DNA from cells and tissues was
prepared with a DNeasy Blood and Tissue Kit (Qiagen). Bisulphite treatment was
performed with an EZ DNA Methylation Kit (Zymo Research) according to the
manufacturer’s instructions (2 µg of DNA was used in the bisulphite reaction).
After bisulphite conversion, DNA was amplified by PCR in triplicate, pooled, 
cloned into pGEMT (Promega) and sequenced using SP6 universal primer. 
Primers used for amplifying mH2A2 promoter were as follows: M2-CG2-F: 
GTTTAGTTTTGGGGAAAGTTTTATGT; M2-CG2-R: TAAAAAAAATTACTCAACCTCATCC.
The online tool QUMA (http://quma.cdb.riken.jp/) was used
for bisulphite sequencing analysis”. 

Cell proliferation, soft agar and migration assays. An MTS proliferation kit was 
used according to the manufacturer’s instructions (Promega). Absorbance values 
(490 nm) were recorded on at least triplicate samples using a BIOTEK Microplate 
Reader. Colony assays were performed by seeding cells at low density and allowing
growth for 10 days. Colonies were fixed, stained with crystal violet and counted. 
Soft agar was performed essentially as described’*. Briefly, cells were plated in 
Tu2% media with 0.33% (w/v) noble agar on top of a 0.5% noble agar layer. 
After 3 weeks, colonies were stained, photographed and counted in five different 
fields using an inverted microscope. Cell migration was measured by Transwell
assay (8-µm pores from Corning). Cells were suspended in serum-free medium,
and DMEM supplemented with 10% FBS was used as chemoattractant. For assays with
WM115 cells, the lower surface of the Transwell was pre-coated with fibronectin
(Sigma, 100 µg ml⁻¹ for 30 min at 37 °C). Cells that migrated after 18 h were
stained with Diff-Quick Stain Kit (Dade Behring) and counted in five different 
fields using an inverted microscope. Wound healing assays were performed as 
described”. 

Statistics. All results are presented as the mean ± s.d. or s.e.m. as indicated.
Statistical analyses were performed by calculating P values using an unpaired 
Student’s t-test (two tailed), unless indicated otherwise. 

Flow cytometry. Cells were collected, washed in phosphate-buffered saline (PBS), 
and fixed in ice-cold 70% ethanol. Propidium iodide staining was performed using 
a Cycletest Plus Staining Kit following the manufacturer’s instructions (Becton 
Dickinson). For apoptosis studies, cells were analysed by flow cytometry by 
Annexin V staining using an Apoptosis Detection Kit (R&D Systems). 

In vivo metastasis assay and subcutaneous injections. In vivo metastasis assays 
were performed as described”’. Briefly, 2 × 10° B16-F1 cells (stably transduced
with sh_GFP, mH2A1_91 and mH2A2_25 shRNA) and 1.5 × 10° B16-F10 cells
(stably transduced with mCherry, H2A-mCherry, mH2A1.2- and mH2A2- 
mCherry) were injected intravenously into BALB/c mice. Mice injected with 
B16-F1 cells were killed 14 or 21 days after infection (performed in duplicate with 
similar results; 6-8 mice per group). Mice injected with B16-F10 cells were killed 
10 days after injection (n = 6-9 per group). Lungs were removed and fixed, iso- 
lated and discrete pigmented lung surface lesions were counted. For subcutaneous 
injections, 2.5 × 10° B16-F1 cells stably infected with shRNAs were injected in the
flanks of 6-week-old C57BL/6J mice (Jackson Laboratories). Mice were injected
(n = 6-9 per group) and measurements taken over 12-14 days. Two million A375
cells expressing mCherry series were injected into NOG mice (NOD/Shi-scid/IL-2Rγnull,
Jackson Laboratories), and measurements taken over 21 days. Tumour
volume was estimated by V = (a² × b)/2, where a is the short axis and b is the long
axis of the tumour. Tissues from all assays were paraffin-embedded and 5-µm
sections stained with haematoxylin and eosin. Experiments were conducted under 
protocol number 080901-01 approved by the New York University Institutional 
Animal Care and Use Committee. 
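
As a worked illustration of the volume estimate V = (a² × b)/2 described above, the following minimal sketch evaluates it for one pair of caliper readings. The function name, the millimetre units and the example values are assumptions chosen for illustration and are not taken from the study.

```python
def tumour_volume(short_axis_mm: float, long_axis_mm: float) -> float:
    """Estimate tumour volume (mm^3) as V = (a^2 * b) / 2.

    a is the short axis and b is the long axis; millimetres are assumed here.
    """
    if short_axis_mm <= 0 or long_axis_mm <= 0:
        raise ValueError("both axes must be positive")
    if short_axis_mm > long_axis_mm:
        raise ValueError("the short axis must not exceed the long axis")
    return (short_axis_mm ** 2) * long_axis_mm / 2.0


# Hypothetical caliper readings (not from the paper): a = 4 mm, b = 6 mm.
example_volume = tumour_volume(4.0, 6.0)
print(example_volume)  # 48.0
```

Because the short axis is squared, a measurement error in a affects the estimate more strongly than the same error in b.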

Microarray hybridization and data analysis. Microarray samples were processed 
in the Microarray Shared Research Facility at Mount Sinai School of Medicine and 
performed on two biological replicates. Total RNA was isolated from cells using 
RNeasy column purification per manufacturer’s protocol (Qiagen). The quality of 
the RNA was evaluated using the Agilent BioAnalyser RNA nano assay. Briefly, 
150 ng of total RNA was reverse transcribed using T7-poly(dT) primer and con- 
verted into double-stranded cDNA. The cDNA was used as a template for sub- 
sequent in vitro transcription with biotin-labelled uridine triphosphate at 37 °C for 
16 h using a GeneChip 3’ IVT Express Kit (Affymetrix). The resulting biotin-labelled
cDNA was chemically fragmented, made into hybridization cocktail
and hybridized to the Mouse Genome 430 Plus 2.0 arrays (Affymetrix) according 
to the Affymetrix GeneChip protocol. The array images were generated through a 
high-resolution GeneChip Scanner 3000 7G (Affymetrix), then converted to digi- 
tized data based on MAS 5.0 within the GeneChip Operating Software. Spike-in 
controls and percentage of present (‘P’) call generated were used for data quality 
control. 

Data were analysed as follows. For normalization, all chip data were scaled to 
have an average signal intensity of 150. Comparison analysis based on MAS 5.0 
was performed for each pair (sh_GFP compared with mH2A1_91, and sh_GFP 
compared with mH2A2_25) on both data sets. The compared data were subjected 
to the following arbitrary filters to improve data reliability: (1) detection call: only
probes that had at least one ‘P’ call in the pair were retained; (2) signal intensity:
only probes that showed signal intensity of at least 100 in at least one of the pairs
were retained; (3) fold change: probes that showed a log-fold change of at least 2
in at least one data set were retained for further analysis; (4) concordance analysis 
was performed to reduce false-positive selection between shRNA lines and the two 
independent microarray experiments: direction and fold-change in gene regu- 
lation had to be a 100% match to qualify as altered genes. Gene list was annotated 
by submission to Netaffx annotation centre within the Affymetrix website (http:// 
www.affymetrix.com), which periodically updates the integrated information for 
each gene across multiple public genome databases. 
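
The four filters described above can be sketched as a small predicate. This is a minimal illustration, not the authors' pipeline: the `Comparison` record, the function names and the example values are hypothetical, filters (1) and (2) are assumed to apply to every comparison of a probe, and filter (4) is read here as requiring the same direction of change in every comparison.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Comparison:
    """One knockdown-versus-control comparison for a single probe (hypothetical layout)."""
    control_signal: float
    knockdown_signal: float
    control_call: str       # MAS 5.0 detection call: 'P', 'M' or 'A'
    knockdown_call: str
    log_fold_change: float


def is_reliable(c: Comparison, min_signal: float = 100.0) -> bool:
    # Filters (1) and (2): at least one 'P' call and one signal >= 100 in the pair.
    has_present_call = "P" in (c.control_call, c.knockdown_call)
    has_signal = max(c.control_signal, c.knockdown_signal) >= min_signal
    return has_present_call and has_signal


def is_altered(comparisons: List[Comparison], min_log_fc: float = 2.0) -> bool:
    if not comparisons or not all(is_reliable(c) for c in comparisons):
        return False
    # Filter (3): the fold-change threshold is met in at least one data set.
    if not any(abs(c.log_fold_change) >= min_log_fc for c in comparisons):
        return False
    # Filter (4), as interpreted here: all comparisons change in the same direction.
    signs = {(c.log_fold_change > 0) - (c.log_fold_change < 0) for c in comparisons}
    return signs in ({1}, {-1})


# Hypothetical probes: one concordant, one discordant between data sets.
concordant = [Comparison(120, 600, "P", "P", 2.3), Comparison(90, 400, "A", "P", 1.8)]
discordant = [Comparison(120, 600, "P", "P", 2.3), Comparison(500, 100, "P", "P", -2.1)]
print(is_altered(concordant), is_altered(discordant))  # True False
```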

Hierarchical clustering, generation of heat maps and gene ontology analysis. A 
portion of the filtered subset of data was used for additional analysis. Cluster analysis 
was performed by unsupervised hierarchical clustering on the log-transformed data 
with Gene Cluster 3.0 (http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/ 
cluster/index.html) by using the correlation (uncentred) similarity metric and 
centroid linkage clustering method. The resulting tree-images were visualized
using Java TreeView. Gene ontology analysis was performed using DAVID 
Bioinformatics Resources” (http://david.abcc.ncifcrf.gov/home.jsp).
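
To make the similarity metric concrete, the sketch below computes an uncentred correlation, which differs from the Pearson correlation in that the profile means are not subtracted. It is a plain re-implementation for illustration and not code from Gene Cluster 3.0; the function names are hypothetical, and taking the distance as one minus the correlation is assumed as the usual convention.

```python
import math
from typing import Sequence


def uncentred_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Correlation computed without subtracting the means (cosine of the angle)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("an all-zero profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def uncentred_distance(x: Sequence[float], y: Sequence[float]) -> float:
    # Assumed convention: distance = 1 - similarity.
    return 1.0 - uncentred_correlation(x, y)


# Proportional profiles are perfectly similar; an offset lowers the similarity,
# whereas a centred (Pearson) correlation would ignore the offset.
print(uncentred_correlation([1, 2, 3], [2, 4, 6]))      # 1.0
print(uncentred_correlation([1, 2, 3], [3, 4, 5]) < 1)  # True
```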

Quantitative PCR and ChIP. Total RNA was extracted using RNeasy Kit (Qiagen).
Reverse transcription was performed with SuperScript II (Invitrogen) using oligo 
dT. qPCR reactions were performed in triplicate on Stratagene Opticon 2 using 
FastStart SYBR Green Master Mix (Roche). Expression levels were normalized to 
TATA binding protein in mouse cells and GAPDH in human cells, or relative to 
B16-F1 sh_GFP for gene target expression. Each qPCR used two independent 
biological replicates. Primer sets used for qRT-PCR are listed in Supplementary 
Table 2. ChIP assays were performed using a Magna ChIP Kit (Protein G; Millipore) 
as per the manufacturer’s instructions. Immunoprecipitations were performed with 
antibodies against mH2A1 (Millipore 07-219), H3 (Abcam ab1791) and control 
IgG (Millipore 12-370). ChIP signals were represented as the percentage of H3, 
calculated by 100 × 2^(Ct(H3) − Ct(mH2A1)). Primers used for ChIP-qPCR are listed in
Supplementary Table 3. 
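
The percentage-of-H3 normalization can be illustrated as follows. This sketch assumes the standard form 100 × 2^(Ct of H3 − Ct of target) and a perfect doubling of product per cycle; the function name and the Ct values are hypothetical and are not data from the study.

```python
def percent_of_h3(ct_h3: float, ct_target: float) -> float:
    """ChIP signal of a target antibody expressed as a percentage of the H3 signal.

    Assumes 100% amplification efficiency, so each cycle of difference
    corresponds to a two-fold difference in recovered DNA.
    """
    return 100.0 * 2.0 ** (ct_h3 - ct_target)


# Hypothetical Ct values: the target IP crosses threshold two cycles after H3,
# which corresponds to a four-fold lower recovery.
print(percent_of_h3(ct_h3=24.0, ct_target=26.0))  # 25.0
```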


29. Kumaki, Y., Oda, M. & Okano, M. QUMA: quantification tool for methylation 
analysis. Nucleic Acids Res. 36, W170-175 (2008). 

30. Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis 

Planar polarized actomyosin contractile flows
control epithelial junction remodelling 


Matteo Rauzi, Pierre-François Lenne & Thomas Lecuit


Force generation by Myosin-II motors on actin filaments drives cell 
and tissue morphogenesis’. In epithelia, contractile forces are 
resisted at apical junctions by adhesive forces dependent on 
E-cadherin”, which also transmits tension®’”"’. During Drosophila 
embryonic germband extension, tissue elongation is driven by cell 
intercalation”, which requires an irreversible and planar polarized 
remodelling of epithelial cell junctions*’. We investigate how cell 
deformations emerge from the interplay between force generation 
and cortical force transmission during this remodelling in 
Drosophila melanogaster. The shrinkage of dorsal-ventral-oriented 
(‘vertical’) junctions during this process is known to require planar 
polarized junctional contractility by Myosin II (refs 4, 5, 7, 12). Here we 
show that this shrinkage is not produced by junctional Myosin II itself, 
but by the polarized flow of medial actomyosin pulses towards 
‘vertical’ junctions. This anisotropic flow is oriented by the planar 
polarized distribution of E-cadherin complexes, in that medial 
Myosin II flows towards ‘vertical’ junctions, which have relatively less 
E-cadherin than transverse junctions. Our evidence suggests that the 
medial flow pattern reflects equilibrium properties of force transmission
and coupling to E-cadherin by α-Catenin. Thus, epithelial morphogenesis
is not properly reflected by Myosin II steady state
distribution but by polarized contractile actomyosin flows that emerge 
from interactions between E-cadherin and actomyosin networks. 

The planar polarized remodelling of cell junctions** that occurs 
during germband extension (GBE) is shown in Fig. 1a. Myosin II
(Myo-II) is concentrated in ‘vertical’ junctions*”' and directs junction 
shrinkage by increasing junctional tension”’*. To understand how 
Myo-II planar polarity is established, we investigated changes in 
Myo-II distribution at the onset of GBE. We used a fusion between 
Myo-II regulatory light chain (MRLC, called Sqh in Drosophila) and 
Cherry (MRLC-Cherry)’* together with E-cad-GFP to mark adherens 
junctions (AJs). When the epithelium is formed, MRLC-Cherry is 
visible in aggregates in the medial region of AJs (Fig. 1b). Subsequently,
MRLC-Cherry is also detected at the cortex of AJs of intercalating
cells (Fig. 1b). An MRLC-GFP fusion rescuing a null sqhAX3
mutant (Fig. 1c) and an antibody against endogenous Myo-II heavy
chain (not shown) displayed the same features. Thus two Myo-II 
populations exist during cell intercalation: a medial and a junctional 
pool (Supplementary Fig. 1). Labelling of F-actin with Utrophin-GFP 
(Utr-GFP) shows a network spanning the AJs (Fig. 1d, Supplementary 
Fig. 1, Supplementary Movie 1a). This network is thin (<500 nm), and 
contains filaments at low density (mesh size 0.5-2 µm) that overlap
and intersect in the form of brighter puncta, which are more apparent 
in a slightly less apical focal plane intersecting the AJs (Supplementary 
Movie 1b, Supplementary Fig. 1). Thus, both Myo-II pools are part of a
large-scale actomyosin network, spanning multiple cells, which con- 
trasts with previous descriptions focused on junctional actin and Myo- 
II (refs 4, 5, 7, 12, 18, 21-23). 

Live imaging of Utr-GFP and MRLC-GFP indicated complex 
dynamics (Supplementary Movies 1a, b and 2). The F-actin mesh
fluctuated, with the mesh changing size in a few tens of seconds
(Fig. 1d and Supplementary Movie 1a, b). Myo-II formed small clusters
(presumably Myo-II minifilaments), which coalesced into large
(~1 µm) medial aggregates on similar timescales (Fig. 1e, Supplementary
Movie 2). Co-imaging of Utr-GFP and MRLC-Cherry revealed
that actin and Myo-II coalesce together during aggregation (Supplementary
Fig. 2a, Supplementary Movie 3; Methods), reflecting local
and transient contractions within the actomyosin network, as also
reported in the Drosophila mesoderm and the one-cell stage
Caenorhabditis elegans cortex'*™*.

Figure 1 | Two pulsating pools of acto-myosin in intercalating cells.
a, Polarized junction shrinkage during cell intercalation. A, P, D and V denote
anterior, posterior, dorsal and ventral, respectively. b, Localization of Myo-II and
E-cad before and during intercalation. c, Respective distribution of medial (red)
and junctional (green) Myo-II along the apico-basal (z) axis. d, e, Apical F-actin
coalesces locally (d, magnified in right panels, arrows), while medial Myo-II
clusters (e, magnified in right panels, arrowheads). f, Myo-II pulses in the medial
(red) and junctional (green) regions. g, Average junctional Myo-II (dark green)
and linear fits for different junctions. h, Temporal cross-correlation of the curves in
f. R is the correlation coefficient. i, Evolution of junctional length. Scale bars, 5 µm.

To further investigate the functions of the medial and junctional Myo- 
II networks, we monitored their temporal evolution during intercalation 
(Fig. 1f-i). Both medial and junctional Myo-II, respectively in the vicinity 
of and at shrinking junctions, fluctuated in intensity (Fig. 1f). In addition 
to being pulsed, the intensity of junctional Myo-II gradually increased 
(Fig. 1g). Meanwhile, the changes in ‘vertical’ junction length are irregular,
showing successive steps of shrinkage and arrest (Fig. 1i). In some
cases, however, transient relaxation was observed (17.6%, N = 17). 

To disentangle this complex behaviour and relate contractile 
dynamics of medial and junctional networks with junction shrinkage, 
we conducted temporal cross-correlation of fluorescence intensity 
(Online Methods). Correlation between temporal profiles of MRLC- 
GFP intensity at the junctions and in the medial regions is high (mean 
<R> = 0.86, Fig. 1h), indicative of similar overall dynamics. Moreover, 
medial pulses precede junctional pulses by 8 ± 4 s (mean ± s.d. hereafter,
Fig. 1h and Supplementary Fig. 3).
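
The lag between two intensity traces can be estimated by scanning time shifts and keeping the one with the highest correlation. The sketch below is a minimal pure-Python illustration on synthetic data, not the analysis code used in the study: the function names, the sinusoidal traces and the 2 s frame interval are hypothetical, chosen so that a 4-frame shift corresponds to an 8 s delay.

```python
import math
from typing import Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lag(medial: Sequence[float], junctional: Sequence[float],
             max_lag: int) -> Tuple[int, float]:
    """Return (lag, r) maximising the correlation of medial[t] with junctional[t + lag].

    A positive lag means the medial trace leads the junctional trace.
    """
    if len(medial) != len(junctional):
        raise ValueError("traces must have equal length")
    n = len(medial)
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = medial[:n - lag], junctional[lag:]
        else:
            a, b = medial[-lag:], junctional[:n + lag]
        if len(a) < 3:
            continue
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best


FRAME_INTERVAL_S = 2.0  # hypothetical acquisition interval
medial = [math.sin(2 * math.pi * t / 30) for t in range(120)]
junctional = [math.sin(2 * math.pi * (t - 4) / 30) for t in range(120)]  # delayed copy

lag_frames, r = best_lag(medial, junctional, max_lag=10)
delay_s = lag_frames * FRAME_INTERVAL_S
print(lag_frames, delay_s)  # 4 8.0
```

Restricting `max_lag` to less than one pulse period avoids picking an equivalent peak one full period away.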

We then compared rates of junction shrinkage with rates of MRLC-GFP
intensity changes (Fig. 2a left), which correspond to local accumulations
of Myo-II by contraction (Fig. 1e). The maximum of the MRLC-GFP
contraction rate in the medial region precedes that of junctional
MRLC-GFP by an average of 10.5 ± 2.5 s (Fig. 2b left). Thus contraction
of Myo-II occurs in the medial region first and subsequently at junctions
(Fig. 2b left, right). Each step of junction shrinkage was associated with
tandem medial and junctional Myo-II pulses (Fig. 2a right, horizontal
braces). Temporal cross-correlation indicated that the peak rate of junction
shrinkage precedes that of junctional Myo-II accumulation by
9 ± 3 s (Fig. 2b left), indicating that junctional Myo-II accumulation
cannot cause the shrinkage steps. However, peak junction shrinkage 
rate temporally coincided with the peak rate of medial Myo-II contrac- 
tion (Fig. 2b left, right) suggesting a mechanical contribution to shrink- 
age increments. 

To test this, we used laser nanodissection”* to locally disrupt medial 
Myo-II clusters at the vicinity of shrinking junctions. Each ablation 
pulse produced a collapse of the Myo-II pulse and a transient and 
reversible relaxation of junction length without affecting junctional 
Myo-II (Fig. 2c left, right, Supplementary Movie 4). Thus, medial 
Myo-II mechanically causes junction shrinkage. This led us to investi- 
gate the function of junctional Myo-II pulses, as previous studies 
showed it was essential for global junction shrinkage*”””. Close inspec- 
tion reveals two situations: (1) in most cases (88%, N = 17), medial 
Myo-II pulses are followed by junctional pulses, and shrinkage steps 
proceed successfully without relaxation (14/15 cases, Fig. 2a right); (2) 
occasionally, (12%) medial pulses are not followed by junctional pulses 
and shrinkage steps relax in all cases (Fig. 2d). Relaxation correlates 
with failure to sustain junctional Myo-II and with an overall decrease 
of Myo-II at junctions (Fig. 2d left, right). This suggests that junctional 
Myo-II stabilizes junction length. 

Together these observations point to a mechanical ‘division of 
labour’, where medial Myo-II pulses shrink, and sustained junctional 
Myo-II accumulation stabilizes, junction length. This iterative cycle 
ensures persistent shrinkage. 


Figure 2 | Medial and junctional Myo-II pools 
have different mechanical roles. a, Left: cartoon 
depicting a vertical junction (length l) and regions
where medial and junctional Myo-II are measured. 
Right: evolution of junction length and Myo-II 
intensities (I). Brackets show clusters of events and 
dashed lines represent the rates of changes. b, Time 

These observations suggested that both processes may be spatially
coordinated. Indeed, medial pulses show a planar polarized distri- 
bution like junctional Myo-II. Defining four quadrants (anterior, A; 
posterior, P; dorsal, D; and ventral, V; diagram in Fig. 3a right) in the 
medial region of cells, we determined the integrated intensity ratio of 
(A+P)/(D+V) MRLC-GFP in time series (Fig. 3a left). Intercalating 
germband cells exhibit a significant medial Myo-II polarity compared 
to non-intercalating head cells or to germband cells of Krüppel (Kr)
RNAi embryos where planar cell polarization is affected*”® (Fig. 3a left, 
Supplementary Movie 6). 
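
The quadrant-based polarity measure can be written out explicitly. In this minimal sketch the function names and the intensity values are hypothetical; a ratio above 1 indicates enrichment in the anterior and posterior quadrants, which border the ‘vertical’ junctions.

```python
from typing import Sequence, Tuple

Quadrants = Tuple[float, float, float, float]  # integrated intensities (A, P, D, V)


def polarity_ratio(a: float, p: float, d: float, v: float) -> float:
    """Integrated intensity ratio (A + P) / (D + V) for one frame."""
    denominator = d + v
    if denominator <= 0:
        raise ValueError("D + V intensity must be positive")
    return (a + p) / denominator


def mean_polarity(frames: Sequence[Quadrants]) -> float:
    """Average the per-frame ratio over a time series."""
    if not frames:
        raise ValueError("at least one frame is required")
    return sum(polarity_ratio(*frame) for frame in frames) / len(frames)


# Hypothetical quadrant intensities for two frames of one cell.
frames = [(30.0, 34.0, 20.0, 12.0), (27.0, 33.0, 25.0, 15.0)]
print(mean_polarity(frames))  # 1.75
```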

We next investigated the spatial dynamics of medial and junctional 
actomyosin networks. Co-imaging of Utr-GFP and MRLC-Cherry 
and particle imaging velocimetry (PIV) indicated that F-actin and 
Myo-II have very similar dynamics and that actomyosin clusters flow 
in the plane of the medial region (Fig. 3b, Supplementary Movie 3). 
Myo-II was moving slightly (22%) but consistently faster than F-actin 
(Supplementary Fig. 5), in agreement with the idea that Myo-II is 
responsible for flow. 

Figure 3 | Medial Myosin-II displays anisotropic flow and feeds ‘vertical’
junctions. a, Left: histograms of medial Myo-II relative intensities in (A+P)
regions over (D+V) regions (see diagram at right for nomenclature) in the
germband of wild-type (WT) and Krüppel (Kr) RNAi embryos, and the head of
WT embryos. WT/Kr RNAi: P = 0.0007, WT (germband)/WT (head):
P = 0.001 (Student’s t-test). Right: representative images of cells with MRLC-Cherry
and E-cad-GFP. b, Comparative PIV of Utr-GFP and MRLC-Cherry
in a cell outlined in red. Blue dots mark vector tips. c, Medial Myo-II flowing to
a vertical junction. Tracking of speckles is shown in coloured lines (right).
d, Left: a medial cluster (red, arrowhead) flows and fuses to the junctional Myo-II
pool (green); right: corresponding quantification. Scale bars, 5 µm.

Tracking of Myo-II speckles (Fig. 3c, Supplementary Movies 2, 5) or
of F-actin with Myo-II (Supplementary Fig. 2b) indicated that the 
polarized distribution of medial Myo-II results from the lateral flow 
of medial pulses towards ‘vertical’ junctions. In KrRNAi embryos, this 
movement occurred randomly (Supplementary Movie 6), consistent 
with the loss of medial Myo-II polarity (Fig. 3a left). 

The polarized flow of Myo-II (0.11 ± 0.03 µm s⁻¹) could either
reflect a movement of Myo-II minifilaments or the propagation of
contractile waves. We tested these alternatives by photobleaching
medial MRLC-GFP clusters. The fluorescence recovery in the
bleached area (recovery fractions 34 ± 10% (N = 5), t1/2 = 4 ± 1 s,
Supplementary Fig. 6a, b, d) was low compared to the junctions (recovery
fractions ~70% (ref. 8), not shown). Moreover, no new cluster appears
in the vicinity of bleached pulses, as would be expected for contractile
waves (Supplementary Fig. 6a). Together this indicates that medial
flows correspond to the movement of relatively stable Myo-II filaments.
Fluorescence recovery after photobleaching (FRAP) experiments
with Utr-GFP show extensive (83 ± 22%, N = 5) turnover
within <3 s (Supplementary Fig. 6c, d), suggesting that the actomyosin
flow is mainly determined by Myo-II contractility on a fast-recycling,
‘permissive’ actin substrate.
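
The two FRAP quantities quoted above, recovery fraction and half-time, can be read off a recovery curve as sketched below. This is an illustration on synthetic data, not the authors' analysis: the function names are hypothetical, the half-time is found by linear interpolation rather than curve fitting, and the synthetic curve is constructed to have a 34% recovery fraction and a 4 s half-time.

```python
from typing import Sequence


def recovery_fraction(pre_bleach: float, post_bleach: float, plateau: float) -> float:
    """Fraction of the bleached fluorescence that is recovered at the plateau."""
    return (plateau - post_bleach) / (pre_bleach - post_bleach)


def half_time(times: Sequence[float], intensities: Sequence[float],
              post_bleach: float, plateau: float) -> float:
    """Time at which the curve first reaches half of its final recovery."""
    half_level = post_bleach + 0.5 * (plateau - post_bleach)
    for i in range(len(times) - 1):
        f0, f1 = intensities[i], intensities[i + 1]
        if f0 <= half_level <= f1:
            if f1 == f0:
                return times[i]
            # Linear interpolation between the two bracketing time points.
            return times[i] + (half_level - f0) * (times[i + 1] - times[i]) / (f1 - f0)
    raise ValueError("the curve never reaches half recovery")


# Synthetic single-exponential recovery (hypothetical intensities, arbitrary units).
PRE, POST, PLATEAU = 1.0, 0.2, 0.472
times = [float(t) for t in range(21)]  # seconds after bleaching
intensities = [POST + (PLATEAU - POST) * (1 - 2 ** (-t / 4.0)) for t in times]

fraction = recovery_fraction(PRE, POST, PLATEAU)
t_half = half_time(times, intensities, POST, PLATEAU)
print(round(fraction, 2), round(t_half, 2))  # 0.34 4.0
```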

We then addressed whether medial pulses are transferred to the 
junctional cortex and cause the formation of junctional pulses. As 
medial MRLC-GFP is slightly (500-1,000 nm) more apical than junc- 
tional MRLC-GFP, confocal sections distinguished the two pools and 
showed fusion of medial Myo-II (red) to the cortex and formation of a 
junctional pulse (green) (Fig. 3d left, right; Supplementary Movie 7). 
No transfer of medial pulses occurred to the adjacent junction follow- 
ing their ablation (Fig. 2c right). Moreover, photobleaching of MRLC- 
GFP along a junction (Supplementary Fig. 7a, b; Supplementary Movie 
8) indicates two sources of exchange: pre-existing Myo-II patches are 
rapidly and strongly recovered (72 ± 6%), consistent with previous
reports’; new junctional patches form de novo where medial Myo-II 
clusters fuse with junctions. 

Junctional Myo-II pulses are delayed by ~8 s relative to medial ones
(Supplementary Fig. 3), reflecting a speed of transfer of ~0.125 µm s⁻¹,
which is similar to the direct flow speed measurements (0.11 ±
0.03 µm s⁻¹).
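
The arithmetic behind this comparison can be made explicit. The text gives the delay (~8 s) and the resulting speed (~0.125 µm s⁻¹) but not the travel distance; the 1 µm used below is therefore an inferred assumption (0.125 µm s⁻¹ × 8 s), and the function names are hypothetical.

```python
def transfer_speed(distance_um: float, delay_s: float) -> float:
    """Speed (um/s) implied by covering a distance within a given delay."""
    if delay_s <= 0:
        raise ValueError("delay must be positive")
    return distance_um / delay_s


def within_one_sd(value: float, mean: float, sd: float) -> bool:
    """True if the value lies within one standard deviation of the mean."""
    return abs(value - mean) <= sd


# Assumed distance of 1 um (inferred, not stated in the text) and the ~8 s delay.
speed = transfer_speed(distance_um=1.0, delay_s=8.0)
print(speed)                                      # 0.125
print(within_one_sd(speed, mean=0.11, sd=0.03))   # True
```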

Thus, medial and junctional actomyosin networks have tightly 
coordinated and hierarchically organized mechanical functions. 
Medial pulses flow to and produce steps of shrinkage of the adjacent 
‘vertical’ junctions. They subsequently fuse with junctions and sustain 
junctional Myo-II accumulation, which stabilizes junction length. This 
flow and transfer are planar polarized, and drive junctional planar 
polarity and cell intercalation. 

What controls the planar polarized flow of medial Myo-II pulses to 
vertical junctions? Mechanical anchoring of actomyosin networks at 
AJs is essential for force production during cell morphogenesis®'”"”*. 
The medial network is also potentially connected to the apical plasma 
membrane given its tight apposition (Supplementary Fig. 8). Imaging 
of the apical plasma membrane with palmitoylated YFP (GAP43- 
Venus) revealed however a flat apical surface in the medial part of 
intercalating cells with few, small protrusions (Supplementary Fig. 9, 
Supplementary Movie 9), unlike apically constricting mesoderm cells 
where the plasma membrane is strongly ruffled (Supplementary Movie 
10). These protrusions display local jitter but no aggregation or flow 
patterns characteristic of the underlying actomyosin network, suggest- 
ing moderate coupling (Supplementary Fig. 9, Supplementary Movie 
9). Co-imaging of GAP43-Cherry and Utr-GFP shows that small 
protrusions and F-actin had uncorrelated trajectories (Supplementary
Movie 11) or moved at different speeds (Supplementary Movie 12;
3.7-fold reduced lateral dynamics (0.03 ± 0.015 µm s⁻¹, Supplementary
Fig. 9) compared to MRLC-GFP or Utr-GFP (0.11 ± 0.03 µm
s⁻¹, Fig. 3c)). Therefore, the apical surface and the medial actomyosin
network are in contact but moderately coupled. 

This suggested that the anisotropic actomyosin flow may largely
depend on the distribution of junctional anchoring points. This
requires E-cadherin/β-Catenin complexes at AJs and depends on
α-Catenin'®”*. E-cadherin/β-Catenin/α-Catenin complexes are planar
polarized’ (not shown), such that medial pulses flow towards regions
with lower amounts of E-cadherin complexes. The level of E-cadherin 
along ‘vertical’ relative to adjacent junctions (E-cadherin anisotropy, 
Fig. 4a left) is also fluctuating (Fig. 4a middle). Moreover, the onset of 
medial pulses coincided with the time when E-cadherin anisotropy 
reached a local maximum (Fig. 4a middle, right) raising the possibility 
that E-cadherin anisotropy may orient the actomyosin flow. Reduction 
of E-cadherin by RNAi causes the disappearance of medial Myo-II 
(Fig. 4b top, c top; Supplementary Movies 13, 14). The junctional 
Myo-II level is consequently strongly reduced and no longer planar 
polarized (Fig. 4b bottom, c bottom). We reasoned that reducing the 
levels of α-Catenin by RNAi should attenuate coupling more subtly. 
α-Catenin RNAi reduces the number of E-cadherin clusters at AJs and 
disrupts interactions with junctional F-actin. Moreover, the distribution 
of E-cadherin is no longer planar polarized in α-Catenin RNAi 
embryos (Fig. 4e, Supplementary Fig. 10). This is associated with a loss 
of medial (Fig. 4f, Supplementary Movie 15) and junctional (Fig. 4d 
top, bottom) Myo-II planar polarity. Thus, the planar polarized 
distribution of E-cadherin/β-Catenin/α-Catenin complexes biases the 
flow of medial Myo-II and junctional polarization. 

In addition to Myo-II contractility, flow requires (1) crosslinkers 
between filaments to transmit tension within the medial meshwork, 
and (2) coupling at the cortex to E-cadherin/β-Catenin/α-Catenin 
complexes. Increased levels of E-cadherin in ‘transverse’ junctions 
may change properties of the actin network (for example, crosslinking/ 
viscosity) and inhibit internal transmission of contractile forces and 
hence prevent D-V oriented flow. To test this, we disrupted the force 
balance within the medial actomyosin network by focal ablation (Fig. 4g 
top, bottom), and imaged the redistribution of medial clusters. If 
increased E-cadherin levels at transverse junctions inhibit tension 
transmission along the D-V axis, then medial pulses should not flow 
in this direction following ablation. However, we observed that Myo-II 
medial clusters flowed radially and away from the point of ablation 
towards the junctions (velocity v = 0.05 ± 0.01 µm s⁻¹) in 100% of 
cases (N = 25), even towards transverse junctions (12/25 cases, 
Fig. 4g top, bottom; Supplementary Fig. 11; Supplementary Movie 
16). Focal ablation of the actin meshwork produces a local hole, which 
expands radially (Supplementary Movie 17). This argues that transverse 


Figure 4 | E-cadherin planar polarity orients 
medial Myosin-II flow. a, Left and middle: medial 
MRLC-Cherry average intensity (red) and E-cad-GFP 
polarity (blue) as a function of time. E-cad-GFP 
polarity is the ratio of its mean intensity in 
transverse (I_t) and vertical (I_v) junctions. Right: 
chronology of events taking as a reference medial 
Myo-II intensity maximum. Delays between events 
are obtained by correlation; shown are mean and 
s.d. The difference is in black. b-d, Top row: Myo- 
II and E-cad in control (b), e-cad RNAi (c) and 
α-cat RNAi (d) embryos. Bottom row: average 
intensity of junctional Myo-II as a function of the 
angle (θ) of the junctions with respect to the A-P 
axis. e, Top: Comparison between normalized 
E-cad-GFP average intensity (= (I_j − Ī)/Ī) of 
transverse versus vertical junctions for water 
injected (blue) and α-cat RNAi embryos (orange); 
P values are shown (Student's t-test). I_j, mean 
intensity at a junction; Ī, mean intensity of all 
junctions in a cell. Diagram at bottom indicates the 
angles of vertical and transverse junctions with 
respect to the A-P axis. f, Histogram of average 
medial Myo-II intensity as in Fig. 2a, right, for α-cat 
RNAi embryos. WT/α-cat RNAi: P = 0.0006 
(Student's t-test). g, Bottom: movement of a Myo- 
II cluster (white arrowhead) following nearby focal 
ablation (red arrowhead). Top right: diagram 
showing the centrifugal directions of the 
trajectories followed by Myo-II clusters (N = 25). 
Scale bars, 5 µm. 
junctions do not inhibit flow per se and that flow directionality emerges 
from the properties of the actomyosin meshwork integrated over the 
entire apical surface. 

The mechanical properties of the medial actomyosin network are 
locally defined by Myo-II contractility (concentration, affinity, duty 
cycle), tension transmission within the network (crosslinking), and 
viscous resistance to deformations (interactions between filaments). 
Moreover, these properties fluctuate owing to protein turnover and 
interactions. E-cadherin is known to anchor and modify actin 
dynamics. Our results suggest that the polarized distribution of 
E-cadherin may control the actomyosin flow pattern by spatially 
modulating mechanical properties of the actin network. 

Current models of epithelial morphogenesis centre on Myo-II 
steady state distribution and associated contractile forces. 
Our data show however that cell deformations cannot be simply 
derived from the Myo-II distribution itself, but from two central fea- 
tures of actomyosin dynamics, namely concentration (pulses) and 
movement (flow). Pulsed dynamics defines the rhythm and possibly 
the speed of deformation. Flow pattern, which in the case of intercala- 
tion is anisotropic, dictates the orientation of cell deformation (Sup- 
plementary Fig. 12). Flows of Myo-II foci have been reported in the 
one-cell stage C. elegans embryo, pointing to a more general property 
of actomyosin networks. An important future avenue of research will 
be to investigate what properties of actin networks control Myo-II flow 
dynamics in different systems. 


METHODS SUMMARY 

Mutants and constructs. To visualize Myosin-II we used MRLC fused to eGFP or 
mCherry and rescuing a protein null sqhAX3 mutant. The following stocks were used: 
sqhAX3; sqh-MRLC::GFP (II) and sqhAX3; ubi-e-cad::GFP, sqh-MRLC::mCherry. The 
plasmid coding for the fusion of eGFP and the actin binding domain of human 
Utrophin was obtained from W. Bement. The Utr-GFP fusion was cloned by PCR in 
a pUASp destination vector (Fig. 1, Supplementary Fig. 1, Supplementary Movies 1a, 
b) or under the sqh promoter (Fig. 3, Supplementary Fig. 2, Supplementary Movie 3). 
The constructs were verified by sequencing. To label the plasma membrane, we used a 
fusion between the palmitoylated GAP43 protein and YFP/Venus expressed by the 
GAL4/UAS system with the maternal tubGAL4VP16 driver line. GAP43-Cherry 
was expressed under the sqh promoter. 

RNA interference. We generated by PCR dsRNA probes directed against Krüppel, 
α-catenin and e-cadherin as described in refs 4, 18. 

Time-lapse imaging. Embryos were prepared and imaged using a spinning disc 
confocal system (Perkin Elmer) on an inverted Nikon microscope with a 100× oil 
immersion objective. Nano-ablation was performed using a home-built set-up. 
Fluorescence recovery after photobleaching (FRAP) measurements were per- 
formed as in Supplementary Fig. 6 using a confocal LSM510 (Zeiss) with a 
Plan-Apochromat 100× oil objective and an argon laser (488 nm). 

Image analysis and quantifications. Intensity measurements, cross-correlation 
analysis, time-delays analysis and PIV analysis are detailed in Online Methods and 
in Supplementary Figs 3, 4. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Fly stocks and constructs. Drosophila MRLC is encoded by spaghetti-squash 
(sqh). All experiments visualizing the dynamics of MRLC used MRLC fused 
to either eGFP or mCherry under the sqh promoter and rescuing a protein null 
sqhAX3 mutant. E-cad-GFP was expressed under the ubiquitin promoter 
(ubi-Ecad::GFP) and rescues a null e-cad/shotgun mutant. 

The following fly stocks were used. Figure 1, Supplementary Fig. 1 and 
Supplementary Movies 1a, b: matGAL4(67) UASp-Utr-GFP (recombinant on 
II). sqhAX3; sqh-MRLC::GFP (II) (generous gift of R. Karess) and sqhAX3; 
ubi-E-cad::GFP, sqh-sqh::mCherry (recombinant on II). Figure 3, Supplementary Fig. 2, 
Supplementary Movie 3: sqhAX3; sqh-MRLC::mCherry, sqh-Utr::GFP (recombinant 
on II). sqh-MRLC::mCherry (on II) is a gift from A. Martin and E. Wieschaus. 

The plasmid coding for the fusion of eGFP and the actin binding domain of 
human Utrophin was obtained from W. Bement. The Utr-GFP fusion was PCR 
amplified and inserted in the pDONR221 GATEWAY plasmid (Invitrogen). The 
fusion was recombined in a pUASp GATEWAY destination vector (pPW, from T. 
Murphy, Carnegie Institute) for expression under the maternal tubGAL4VP16 
driver line (67Gal4) in Fig. 1, Supplementary Fig. 1 and Supplementary Movies 1a, 
b, or expression under the sqh promoter in Fig. 3, Supplementary Fig. 2 and 
Supplementary Movie 3. In the latter case, Utr-GFP expression is lower than 
under the Gal4 system and hence reveals only the brighter structures (puncta), 
which are also visible in Supplementary Movies 1a and b and Fig. 1d, and not the 
individual filaments seen in Fig. 1d. 

To label the plasma membrane we used a fusion between the palmitoylated 
GAP43 protein and the YFP variant Venus expressed by the GAL4/UAS system 
with the maternal tubGAL4VP16 driver line. GAP43-Cherry was constructed 
similarly and expressed under the sqh promoter as in ref. 31. 

RNA interference. We generated by PCR dsRNA probes directed against 
Krüppel, α-catenin and e-cadherin using the following primers. The underlined 
sequence is the T7 promoter. The sequence not underlined corresponds to the 
template sequence. e-cadherin: 533 nucleotides, between +1475 to +2008 from 
ATG. E-cad-T7-F, taatacgactcactatagggagaccacgagtctctttgataatggcgagcc. 
E-cad-T7-R, taatacgactcactatagggagaccaccggtttccatcgttctggtgaatc. 
α-catenin: 728 nucleotides, between +81 to +808 from ATG. 
α-Cat-T7-F, taatacgactcactatagggcacaatgtcagttgaaaaaacacttg. 
α-Cat-T7-R, taatacgactcactatagggcttggcatgactttccttgggcaac. 
Krüppel: 775 nucleotides, between +491 to +1266 from ATG. 
Kr-T7-F, taatacgactcactatagggagaccacggagtttcagaccgagatcagca. 
Kr-T7-R, taatacgactcactatagggagaccacagagctggctccatcttcagaca. 
Embryos were injected as described in ref. 35. 

Time lapse imaging. Embryos were prepared and imaged as detailed in ref. 36, 
using a spinning disc confocal system (Perkin Elmer) on an inverted Nikon 
microscope with a 100×/1.4 oil immersion objective. 

Nano-ablation experiments. We performed nano-dissection experiments with a 
home-built system. A near-infrared (NIR, 1,030 nm) femtosecond (fs) laser at 
50 MHz repetition rate (t-Pulse, Amplitude Systems) was coupled to an inverted 
microscope (Eclipse TE 2000-E, Nikon). A fast multicolour confocal imaging 
system, based on the Yokogawa spinning disk (Ultraview ERS, Perkin Elmer), 
was also mounted at a side port of the microscope. Local ablation and fast 
fluorescence imaging were thus possible simultaneously. The NIR-fs laser beam was 
expanded through a ×5 telescope and aligned with the microscope optical path 
with a dichroic mirror (FF01-750/SP, Semrock) immediately below the objective 
lens (×60/1.2, water immersion, Plan Apo VC, Nikon). The collimated beam filled 
the back aperture of the objective lens, which transmits 68% of the incoming NIR 
light. Nano-dissections of medial Myo-II were performed by exposing this 
structure to the tightly focused laser for 1-3 ms with an average power of 360 mW at 
the back aperture of the objective. Exposure time was controlled by an automated 
1.5-mm-diameter mechanical shutter (LS2, Uniblitz). The sample was positioned 
over the tightly focused laser beam using a computer-controlled mechanical 
stage (Scan IM with a Tango2-Desktop controller, Märzhäuser). A very similar 
set-up has already been shown to allow sub-cellular ablations. 

Fluorescence intensity measurements. The intensity of the medial Myo-II is 
defined as the sum of the average intensities of two regions of interest (ROIs) close 
to the junction (the centres of the elliptical ROIs were ~1 µm away from the 
junction, Fig. 2a in red). The intensity of the junctional Myo-II is defined as the 
average intensity of a 500-nm-wide stripe along the junction (Fig. 2a in green). The 
E-cad anisotropy is the average intensity of a 500-nm-wide stripe along transverse 
junctions divided by the average intensity measured along the vertical junction 
(Fig. 4a in blue). Intensity measurements were made using ImageJ (version 
1.39p). Analyses were done on time lapse movies (one frame every 1-3 s). For 
each frame, 6-10 z-planes were imaged over 3 µm. For long time lapse imaging 
(>200 s), bleach correction was performed using ImageJ. 
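
The intensity definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measurements in the paper were made in ImageJ, and the function names, the boolean masks standing in for the ROIs and stripes, and the toy frame are all hypothetical.

```python
import numpy as np

def roi_mean(frame, mask):
    """Average pixel intensity of `frame` inside a boolean `mask` (hypothetical helper)."""
    return float(frame[mask].mean())

def medial_intensity(frame, roi_a, roi_b):
    """Medial Myo-II signal: sum of the average intensities of two ROIs."""
    return roi_mean(frame, roi_a) + roi_mean(frame, roi_b)

def ecad_anisotropy(frame, transverse_mask, vertical_mask):
    """E-cad anisotropy: mean intensity along transverse junctions divided by
    the mean intensity along the vertical junction."""
    return roi_mean(frame, transverse_mask) / roi_mean(frame, vertical_mask)

# Toy 5x5 frame: row 0 plays the transverse stripe, column 0 the vertical stripe.
frame = np.zeros((5, 5))
frame[0, :] = 4.0
frame[1:, 0] = 2.0
frame[2, 2] = 3.0
frame[3, 3] = 1.0

transverse = np.zeros((5, 5), dtype=bool)
transverse[0, :] = True
vertical = np.zeros((5, 5), dtype=bool)
vertical[1:, 0] = True
roi_a = np.zeros((5, 5), dtype=bool)
roi_a[2, 2] = True
roi_b = np.zeros((5, 5), dtype=bool)
roi_b[3, 3] = True

anisotropy = ecad_anisotropy(frame, transverse, vertical)
medial = medial_intensity(frame, roi_a, roi_b)
```

In practice the masks would come from the 500-nm-wide stripes and elliptical ROIs drawn on each frame; the ratio and sum are then evaluated frame by frame to give the time series used later.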

Cross-correlation analysis. Cross-correlation was performed using the Igor Pro 
(Wavemetrics) cross-correlation function. This function is given by: 

$$c(\tau) = \int_0^T f(t)\, g(t + \tau)\, dt$$

where T represents the overall time over which measurements were made, f(t) and 
g(t) the two cross-correlated functions (taking f as reference), and τ the time delay. 

The basal signals f_min and g_min were subtracted from the f and g functions, respectively, 
before cross-correlation. The final cross-correlation function was normalized 
as follows: 

Time delay measurements. In Fig. 2b all time delays were measured by cross- 
correlation. Cross-correlation analysis was assessed by performing a measure of 
delays between peaks for each cluster of events (for example, Fig. 2 shows three 
clusters of events) (see Supplementary Figs 1 and 2 top middle panel). When 
correlating with contraction rate functions, curves were smoothed by using a 
binomial algorithm implemented in Igor Pro software. For this analysis five cases 
of fully intercalating cells (corresponding to 15 clusters of events) were taken from 
five different wild-type MRLC-GFP embryos. Time lapse movies were taken at a 
rate of 1 frame s⁻¹. Each frame consisted of a z-stack of 3 µm (images spaced by 
500 nm). Time lapse ranged between 200 and 500 s. Time delays in Fig. 4a, right, 
were determined as follows. The delay between the E-cad anisotropy peak and the 
medial Myo-II intensity peak was measured by cross-correlation (the medial 
Myo-II intensity curve was taken as reference). The time onset of medial Myo-II intensity 
pulses with respect to medial Myo-II intensity peak maxima was determined from 
the autocorrelation of the medial Myo-II intensity, which provides a measure of the 
average pulse duration, and therefore a measure of the average delay between pulse 
onset and pulse intensity peak. Auto-correlation analysis was assessed by perform- 
ing a measure of delays for each cluster of events (see Supplementary Fig. 2 bottom) 
as for cross-correlation analysis. For this analysis five cases of intercalating cells 
were taken from five different wild-type MRLC-Cherry / E-cad-GFP embryos. 
Time lapse movies (one frame every 3s) of both MRLC-Cherry and E-cad-GFP 
were taken. Each frame consisted of a z-stack of 3 µm (images spaced by 500 nm). 
Time lapse ranged between 200 and 500 s. Igor Pro software was used for all time 
delay measurements. 
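
The delay measurement by cross-correlation can be illustrated with a short NumPy sketch. It is not the Igor Pro procedure used in the paper: the function names, the synthetic Gaussian pulses, and in particular the normalization (dividing by the geometric mean of the two zero-lag autocorrelations) are assumptions made for this example.

```python
import numpy as np

def cross_correlation(f, g, max_lag):
    """Discrete c(tau) = sum_t f(t) * g(t + tau) for tau in [-max_lag, max_lag] frames."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    f = f - f.min()  # subtract the basal signal f_min
    g = g - g.min()  # subtract the basal signal g_min
    n = len(f)
    lags = np.arange(-max_lag, max_lag + 1)
    c = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            c[i] = np.dot(f[:n - lag], g[lag:])
        else:
            c[i] = np.dot(f[-lag:], g[:n + lag])
    # Assumed normalization (not taken from the source text).
    c /= np.sqrt(np.dot(f, f) * np.dot(g, g))
    return lags, c

def delay_of_peak(f, g, max_lag, dt=1.0):
    """Delay (in units of dt) at which c(tau) is maximal; positive if g lags f."""
    lags, c = cross_correlation(f, g, max_lag)
    return lags[np.argmax(c)] * dt

# Synthetic example: g is the same pulse as f, delayed by 10 frames.
t = np.arange(200)
f = np.exp(-((t - 80) ** 2) / (2 * 8.0 ** 2))
g = np.exp(-((t - 90) ** 2) / (2 * 8.0 ** 2))
delay = delay_of_peak(f, g, max_lag=50, dt=1.0)
```

With f taken as the reference, a positive delay means that the g signal peaks after f; setting `dt=3.0` would convert frames to seconds for movies acquired at one frame every 3 s.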

PIV analysis. PIV was determined with the MATLAB toolbox (procedure MatPIV) 
developed by J. K. Sveen. 
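
For readers unfamiliar with PIV, the core idea is to cross-correlate small interrogation windows from two consecutive frames and read the local displacement from the correlation peak. The sketch below is a minimal NumPy illustration of that idea, not MatPIV itself; the function names, the non-overlapping 16-pixel windows and the integer-pixel peak search are all assumptions of this example.

```python
import numpy as np

def window_displacement(win_a, win_b):
    """Integer-pixel shift (dy, dx) that best maps win_a onto win_b,
    found at the peak of their circular cross-correlation (computed by FFT)."""
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    # corr[s] = sum_x a[x] * b[x + s], evaluated for all shifts s at once
    corr = np.fft.ifft2(np.conj(np.fft.fft2(a)) * np.fft.fft2(b)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices above half the window size correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

def piv_field(frame_a, frame_b, win=16):
    """Displacement (dy, dx) for each non-overlapping win x win window."""
    ny, nx = frame_a.shape[0] // win, frame_a.shape[1] // win
    field = np.zeros((ny, nx, 2), dtype=int)
    for i in range(ny):
        for j in range(nx):
            sl = (slice(i * win, (i + 1) * win), slice(j * win, (j + 1) * win))
            field[i, j] = window_displacement(frame_a[sl], frame_b[sl])
    return field

# Synthetic check: a random texture shifted circularly by (2, -3) pixels.
rng = np.random.default_rng(0)
a = rng.random((32, 32))
b = np.roll(a, (2, -3), axis=(0, 1))
shift = window_displacement(a, b)
field = piv_field(a, b, win=16)
```

Dividing each displacement by the frame interval gives a velocity; dedicated toolboxes add sub-pixel peak fitting, overlapping windows and outlier filtering, none of which is attempted here.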

Fluorescence recovery after photobleaching. Fluorescence recovery after photo- 
bleaching (FRAP) measurements were performed as in Supplementary Fig. 6 using 
a confocal LSM510 (Zeiss) with a Plan-Apochromat 100×/1.3 oil objective and an 
argon laser (488 nm). Before and after photobleaching, images were acquired at 
low laser power (0.1% AOTF, Acousto Optic Tunable Filter) to avoid bleaching 
and with a pixel size of 40 nm. Photobleaching was performed for 0.9 s at full laser 
power over an ROI with 1 µm diameter. Fluorescence recovery was then recorded 
for 50 s. In Supplementary Fig. 7, we used the photokinesis unit of a Perkin Elmer 
confocal system for FRAP and the region of interest is a line 5 um long. 

Endothelial nitric oxide synthase (eNOS) is critical in the regulation 
of vascular function, and can generate both nitric oxide (NO) and 
superoxide (O2•−), which are key mediators of cellular signalling. In 
the presence of Ca2+/calmodulin, eNOS produces NO, endothelial- 
derived relaxing factor, from L-arginine (L-Arg) by means of elec- 
tron transfer from NADPH through a flavin containing reductase 
domain to oxygen bound at the haem of an oxygenase domain, 
which also contains binding sites for tetrahydrobiopterin (BH4) 
and L-Arg. In the absence of BH4, NO synthesis is abrogated 
and instead O2•− is generated. While NOS dysfunction occurs 
in diseases with redox stress, BH4 repletion only partly restores NOS 
activity and NOS-dependent vasodilation. This suggests that there 
is an as yet unidentified redox-regulated mechanism controlling 
NOS function. Protein thiols can undergo S-glutathionylation, a 
reversible protein modification involved in cellular signalling and 
adaptation. Under oxidative stress, S-glutathionylation occurs 
through thiol-disulphide exchange with oxidized glutathione or 
reaction of oxidant-induced protein thiyl radicals with reduced 
glutathione. Cysteine residues are critical for the maintenance 
of eNOS function; we therefore speculated that oxidative stress 
could alter eNOS activity through S-glutathionylation. Here we 
show that S-glutathionylation of eNOS reversibly decreases NOS 
activity with an increase in O2•− generation primarily from the 
reductase, in which two highly conserved cysteine residues are 
identified as sites of S-glutathionylation and found to be critical for 
redox-regulation of eNOS function. We show that eNOS 
S-glutathionylation in endothelial cells, with loss of NO and gain of 
O2•− generation, is associated with impaired endothelium-dependent 
vasodilation. In hypertensive vessels, eNOS S-glutathionylation is 
increased with impaired endothelium-dependent vasodilation that 
is restored by thiol-specific reducing agents, which reverse this 
S-glutathionylation. Thus, S-glutathionylation of eNOS is a pivotal 
switch providing redox regulation of cellular signalling, endothelial 
function and vascular tone. 

We observed that oxidized glutathione (GSSG) induces dose- 
dependent S-glutathionylation of human eNOS (heNOS) that was 
reversed by reducing agents, such as 2-mercaptoethanol or dithiothreitol 
(DTT) (Fig. 1a). S-Glutathionylation greatly decreased NOS activity 
(Fig. 1b) in a dose-dependent manner (Supplementary Fig. 1), but this 
was reversed by DTT with more than 80% recovery. When accessible 
thiols were alkylated by N-ethylmaleimide (NEM), NOS activity was 
abolished (more than 95% decrease; Fig. 1b). As expected, the NOS 
activity of control, S-glutathionylated or S-alkylated heNOS was totally 
inhibited by the NOS inhibitor L-NG-nitroarginine methyl ester 
(L-NAME). In contrast to the marked (more than 70%) loss of NOS 
activity with S-glutathionylation, only a 56% decrease in NADPH con- 
sumption was seen that was only partly inhibited by L-NAME or the 
Ca2+ chelator EGTA (Supplementary Fig. 2). Although thiol-alkylation 
abolished NOS activity, it decreased NADPH consumption by only 
about 50%, and this was not inhibited by L-NAME or EGTA. Thus, thiol 
modification uncouples eNOS with electron leakage from the reductase. 

Because electron leakage could trigger O2•− generation, electron 
paramagnetic resonance (EPR) spin trapping was performed to demonstrate 

Figure 1 | S-glutathionylation of heNOS occurs and inhibits NOS activity. 
a, Immunoblotting of heNOS S-glutathionylation. Top: immunoblotting of 
protein S-glutathionylation (PrS-SG) with anti-GSH antibody. Control non-S- 
glutathionylated heNOS (1 µg in 20 µl) or heNOS S-glutathionylated by 0.5, 1 or 
2 mM GSSG at room temperature (23 °C) for 1 h. Treatment with 
2-mercaptoethanol (ME) after S-glutathionylation with 2 mM GSSG reversed 
the S-glutathionylation. Bottom: immunoblotting with anti-eNOS antibody. 
b, Effect of S-glutathionylation and S-alkylation on heNOS activity. NOS activity 
was measured from control, S-glutathionylated (2 mM GSSG for 20 min) or 
alkylated (1 mM NEM for 20 min) heNOS. NOS activity of treated or untreated 
heNOS was fully inhibited by L-NAME (1 mM) or EGTA (1 mM). c, d, Effects of 
S-glutathionylation on O2•− generation from heNOS. O2•− generation was 
measured from control, S-glutathionylated (as in b) or alkylated (as in b) 
BH4-bound heNOS by EPR spin trapping with 25 mM 5-diethoxyphosphoryl-5- 
methyl-1-pyrroline N-oxide (DEPMPO). c, Spin-trapping showed no signal in 
the absence of heNOS and only trace signal from control enzyme; 
S-glutathionylation triggered a marked increase in O2•− generation with an 
O2•−-adduct spectrum that was quenched by Cu,Zn superoxide dismutase (SOD) 
(200 U ml⁻¹). d, Effect of L-NAME (1 mM) and EGTA (1 mM) on O2•− 
generation from control, S-glutathionylated and alkylated BH4-bound heNOS. 
Results in b and d are shown as means and s.e.m. (n = 3-5). 

this S-glutathionylation-dependent O2•− generation from heNOS. 
S-Glutathionylation greatly increased O2•− generation (more than 
fivefold) with a prominent O2•−-adduct signal that was quenched by Cu,Zn 
superoxide dismutase (Fig. 1c). The NOS inhibitor L-NAME, which 
blocks O2•− generation from the oxygenase, only partly blocked this 
O2•− generation (Fig. 1d), and it was also incompletely blocked by 
EGTA. S-Alkylation of heNOS increased O2•− generation (about 
fourfold), and this was not blocked by L-NAME or EGTA. In contrast, the 
low-level O2•− production from control heNOS was fully quenched by 
L-NAME or EGTA. Thus, S-glutathionylation and S-alkylation uncouple 
heNOS, greatly increasing O2•− generation, and the partial or complete 
lack of inhibition by L-NAME suggests that the observed O2•− is largely 
derived from the reductase domain. 

To investigate the mechanism of S-glutathionylation-induced 
heNOS uncoupling, we sought to determine the specific residues modi- 
fied. We therefore subjected S-glutathionylated heNOS to proteolytic 
digestion and liquid chromatography-tandem mass spectrometry 
(LC-MS/MS) analysis. Peptides with a mass difference of 305 Da, 
representing one glutathione moiety, were detected by LC-MS and 
their primary sequence was determined by MS/MS. We identified 
two glutathionylated cysteine residues within the reductase domain, 

Figure 3 | Effect of redox stress on eNOS S-glutathionylation and function 
in endothelial cells. a, Immunostaining of eNOS (left column, green 
fluorescence) and S-glutathionylation (second column, red fluorescence) in 
control BAECs and cells preincubated with BCNU. The third column shows the 
merged S-glutathionylation/eNOS image along with 4',6-diamidino-2- 
phenylindole (DAPI) staining of the nucleus (blue). eNOS staining and 
S-glutathionylation seem to co-localize. The right-hand column shows O2•− 
detection with dihydroethidine (DHE), which is oxidized by O2•− to a product 
with red fluorescence, and the cell nuclei were counterstained with DAPI (blue). 
Increased O2•− generation was seen in the BCNU-treated cells. Bottom: 

Figure 2 | Cysteine mutants (C689S, C908S and 
C689S/C908S) of heNOS resist S-glutathionylation 
and secondary uncoupling. WT heNOS and heNOS 
C689S, C908S and C689S/C908S mutants were 
treated with 2 mM GSSG. a, Percentage loss of NOS 
activity after treatment of heNOS with GSSG. 
b, Percentage increase in O2•− generation after 
treatment of heNOS with GSSG. c, Ratio of relative 
eNOS S-glutathionylation to eNOS protein. The 
relative intensity of eNOS S-glutathionylation/ 
eNOS protein was normalized to the wild-type 
value. The Cys→Ser mutants maintained a NOS 
activity similar to that of the wild type 
(120 ± 12 nmol min⁻¹ mg⁻¹). Results are shown as 

namely Cys 689 and Cys 908, from both trypsin and chymotrypsin diges- 
tions (Supplementary Fig. 3a, b). Using molecular modelling to predict 
the three-dimensional structure of the heNOS reductase domain 
(Supplementary Fig. 4), we found that Cys 689 and Cys 908 are located 
on the domain surface surrounded by several positively charged residues, 
and thus would probably be deprotonated at physiological pH, making 
them good candidates for S-glutathionylation. 
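
The 305 Da mass difference used to flag glutathionylated peptides follows from the composition of glutathione. As a rough check (a sketch only, using standard monoisotopic atomic masses and the formula C10H17N3O6S for reduced glutathione, neither of which is stated in the text), the shift is the mass of GSH minus the two hydrogens lost when the mixed disulphide forms:

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}

def monoisotopic_mass(formula):
    """Monoisotopic mass of a molecule given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())

# Reduced glutathione (GSH), C10H17N3O6S.
GSH = {"C": 10, "H": 17, "N": 3, "O": 6, "S": 1}
gsh_mass = monoisotopic_mass(GSH)

# Forming Cys-S-S-G removes one H from the protein thiol and one from GSH,
# so the modified peptide is heavier by GSH minus 2 H.
adduct_shift = gsh_mass - 2 * MONOISOTOPIC["H"]

print(round(gsh_mass, 3), round(adduct_shift, 3))
```

The computed shift of about 305.07 Da matches the 305 Da difference quoted for one glutathione moiety.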

S-Glutathionylation results in the formation of a mixed disulphide 
bond between the reactive Cys-thiol and reduced glutathione (GSH), a 
tripeptide consisting of glycine, cysteine and glutamate. The addition 
of this bulky negatively charged group can alter protein structure and 
function in a similar manner to the addition of a phosphate. Our 
molecular modelling reveals that both Cys 689 and Cys 908 are located 
at the interface of the FAD-binding and FMN-binding domains. 
Modification of these residues would therefore disrupt FAD-FMN 
alignment, interrupting electron transfer between the flavins and 
enhancing their solvent accessibility (Supplementary Fig. 4), so that 
O2 could gain access and accept an electron from the reduced flavin, 
with the formation of O2•−. 

Mutagenesis of Cys 689 or Cys 908 to Ser was used to test the 
importance of these residues on the redox regulation of eNOS. 
immunoprecipitation of eNOS from the BCNU-treated cells shows that eNOS 
S-glutathionylation occurs but is reversed by 1 mM DTT. The upper row shows 
immunoblotting with anti-GSH antibody; the lower row shows immunoblotting 
with anti-eNOS antibody. b, Effects of eNOS silencing from BAECs on 
BCNU-induced O2•− generation. Top: immunoblotting against eNOS to determine the 
efficiency of eNOS silencing. Middle: confocal microscopy of eNOS (upper row) 
and O2•− measurements with DHE (lower row). NOS3 short interfering RNA 
(siRNA) greatly decreases eNOS expression in BAECs (middle column). 
Bottom: graph of the effect of eNOS silencing on BCNU-induced O2•− 
generation. Results are shown as means and s.e.m. (n = 5). Asterisk, P < 0.001. 

Whereas wild-type (WT) heNOS is S-glutathionylated by GSSG with a 
roughly 70% loss of NOS activity, Cys→Ser mutants resist 
glutathionylation, with no loss of NOS activity in the double mutant and only 
modest loss in the single mutants (Fig. 2a-c). These mutants also resist 
GSSG-induced eNOS uncoupling with O2•− generation. Thus, both 
Cys 689 and Cys 908 are critical for the redox regulation of eNOS. 

Next we sought to determine the consequences of eNOS 
S-glutathionylation in endothelial cells. Inhibition of glutathione 
reductase by 1,3-bis(2-chloroethyl)-1-nitrosourea (BCNU) decreases the 
cellular GSH/GSSG ratio, leading to protein S-glutathionylation. 
Previous studies of bovine aortic endothelial cells (BAECs) treated with 
BCNU reported eNOS inhibition with glutaredoxin or thioredoxin 
inactivation; however, the molecular mechanism of this process 
and alterations in eNOS were not investigated. In our current study of 
BAECs treated with BCNU, confocal microscopy demonstrated marked 
cellular S-glutathionylation that co-localized with eNOS (Fig. 3a, left 
columns). Immunoprecipitation of eNOS followed by immunoblotting 
confirmed that the BCNU-induced increase in GSSG led to eNOS 
S-glutathionylation (Fig. 3a, bottom). This was further confirmed by 
mass spectrometry, in which Cys 689 was more than 50% 
S-glutathionylated (Supplementary Fig. 5). BCNU-treated BAECs 
showed increased O2•− generation that was blocked by DTT, which 
reversed the eNOS S-glutathionylation (Fig. 3a, right column, and 
Supplementary Fig. 6). BCNU also dose-dependently decreased cellular 
eNOS-derived NO production (Supplementary Fig. 7). Thus, 
alterations in the cellular GSH/GSSG ratio led to the S-glutathionylation of 
eNOS, and this resulted in decreased NO and increased O2•− 
generation. eNOS gene silencing from BAECs abolished BCNU-induced O2•− 
generation (Fig. 3b). Experiments in COS7 cells transfected with WT or 
C689A/C908A eNOS confirmed that glutathionylation at Cys 689/ 
Cys 908 is critical for the triggering of BCNU-induced O2•− generation 
(Supplementary Fig. 8). 

To further determine whether redox stress leading to S- 
glutathionylation alters endothelial function in vessels, aortic segments 
were pre-exposed to BCNU and then measurements of endothelium- 
dependent or endothelium-independent relaxation were performed. In 
BCNU-exposed vessels, a marked decrease in endothelium-dependent 
vasodilation was seen (Fig. 4a, left panel), whereas endothelium- 
independent vasodilation elicited by exogenous NO was unaffected 
(Fig. 4a, right panel). Furthermore, DTT, which reverses eNOS 
S-glutathionylation, restored endothelium-dependent vasodilation 
in BCNU-treated vessels. 

Oxidant-stress-induced disruption of endothelium-dependent vasodi- 
lation is involved in the pathogenesis of hypertension, atherosclerosis and 
other cardiovascular disease. Because eNOS S-glutathionylation 
profoundly impaired endothelium-dependent vasodilation, we speculated 
that there might be an increase in eNOS S-glutathionylation in hyperten- 
sion. Indeed, in the vessels of spontaneously hypertensive (SHR) rats en 
face immunohistology showed marked S-glutathionylation with promi- 
nent endothelial co-localization with eNOS (Supplementary Fig. 9), 
whereas control normotensive vessels (from WKY rats) had little 
S-glutathionylation. Immunoprecipitation of eNOS confirmed these 
results, showing much higher eNOS S-glutathionylation in vessels from 
SHR rats in comparison with vessels from WKY rats (Fig. 4c). The 
marked decrease in endothelium-dependent vasodilation of aortic rings 
from SHR rats was reversed by thiol-specific reducing agents that con- 
currently reverse eNOS S-glutathionylation (Fig. 4b, c). Thus, just as in 
the in vitro and ex vivo settings, eNOS S-glutathionylation occurs in 
vessels in vivo and increases with oxidative stress, resulting in a loss of 
endothelium-dependent relaxation, leading to hypertension. Other redox 
modifications of critical thiols on eNOS or other important regulatory 
proteins could further contribute to vascular dysfunction and the 
pathogenesis of hypertension. 

There is extensive evidence that thiols potentiate eNOS activity and 
alleviate oxidant stress. NOS uncoupling induces oxidant stress and 
has previously been shown to occur with depletion of L-Arg or BH4 and 
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Figure 4 | Effect of redox stress on eNOS S-glutathionylation and function 
in vessels. a, Endothelium-dependent and endothelium-independent 
vasorelaxation in control and BCNU (80 .M)-treated rat aortic rings. BENU 
markedly decreased endothelium-dependent relaxation to acetylcholine (left 
panel) but not endothelium-independent relaxation by the NO donor 
NONOate (right panel). DTT (1 mM for 20 min) reversed the BCNU-induced 
inhibition of relaxation (left panel). Aortic relaxation is plotted as the 
percentage decrease in phenylephrine (PHE)-induced contraction against 
agonist concentration on a logarithmic scale. Results are shown as 

means + s.e.m.; P< 0.05, BCNU versus control or BCNU + DTT (n = 4). 

b, Endothelium-dependent vasorelaxation in spontaneously hypertensive 
(SHR) and control (WKY) aortic rings. SHR rings showed a marked decrease in 
relaxation to acetylcholine; however, DTT (as above) re-established the 
acetylcholine response. Endothelium-independent relaxation (right) was 
similar for both SHR and WKY rings. Aortic relaxation is expressed as in a. 
P<0.05, SHR versus WKY or SHR + DTT (n = 4). See also Supplementary 
Fig. 10. c, eNOS S-glutathionylation of SHR and WKY aortae. Top: WKY and 
SHR aortae, either untreated or DTT-pretreated as in b, were homogenized. 
This was followed by immunoprecipitation with anti-eNOS antibody. The 
immunoprecipitation products were separated by SDS-PAGE followed by 
immunoblotting against anti-GSH and anti-eNOS antibodies. In SHR aortae, 
eNOS S-glutathionylation was markedly increased compared with WKY aortae 
and was abolished by pretreatment with DTT. Bottom: ratio of relative intensity 
of eNOS S-glutathionylation/eNOS, normalized to SHR aortae. There is only 
trace eNOS S-glutathionylation in WKY aortae, whereas high levels are seen in 
SHR aortae. There was no detectable (n.d.) eNOS S-glutathionylation in 
DTT-pretreated WKY or SHR aortae. Asterisk, P < 0.001 versus SHR (n = 5). 
elevation of methylarginine levels. Here we show that eNOS possesses 
specific redox-sensitive thiols that are readily S-glutathionylated 
in endothelial cells and vessels with marked endothelial dysfunction 
and hypertension. This oxidative modification switches eNOS from its 
classical NO synthase function to that of an NADPH-dependent 
oxidase generating O₂•⁻, which occurs primarily from the reductase 
domain and, in contrast to other uncoupling mechanisms, is not 
inhibited by typical NOS inhibitors. Because NO and O₂•⁻ have 
many opposing roles in cell signalling and vascular function, 
S-glutathionylation of eNOS will trigger profound changes in cellular 
and vascular function and will mediate redox signalling under oxidative 
stress. This mechanism of eNOS uncoupling could be triggered 
by other uncoupling processes such as BH₄ depletion, but could also 
further enhance BH₄ depletion. Further studies will be needed to 
elucidate these interactions. 

These observations provide a new molecular understanding of how 
oxidant stress alters endothelial function and vascular tone and how 
the restoration or supplementation of reducing equivalents can restore 
endothelial function and normalize vascular tone. Therapeutics with 
thiol-reducing properties can therefore now be developed and refined 
as potent drugs for reversing endothelial dysfunction and ameliorating 
hypertension and other cardiovascular disease. Recently, hydrogen 
sulphide, a potent reducing agent, has been identified as a critical 
endogenous signalling molecule conferring potent cardiac protection 
in diseases with oxidant stress*°; however, its mechanism of action is 
unknown. Our present observations provide a mechanism by which it 
might confer protection. 

S-Glutathionylation thus uncouples eNOS, switching it from NO to 
O₂•⁻ generation. This process is induced by oxidant stress and is reversible. 
Two highly conserved cysteine residues at the interface between the 
FMN-binding and FAD-binding domains are S-glutathionylated, leading 
to uncoupling with O₂•⁻ generation. Oxidant stress triggers eNOS 
S-glutathionylation in endothelial cells and intact vessels. Furthermore, 
S-glutathionylation is increased in hypertensive vessels, resulting in 
impaired endothelium-dependent vasodilation. In view of the central 
importance of NO and eNOS-mediated endothelial dysfunction in dis- 
eases including heart attack, stroke, diabetes and cancer, identification 
of this novel redox-signalling pathway provides new insights into thera- 
peutic approaches for the prevention or amelioration of many of the 
most prevalent diseases afflicting mankind. 


METHODS SUMMARY 


heNOS was expressed, purified and characterized as described'*. EPR spin-trapping 
and fluorescence were used to measure NO and O₂•⁻ generation. Immunofluorescence 
microscopy and immunoprecipitation were applied to detect eNOS 
S-glutathionylation in BAECs and aortae. Specific cysteine residues of eNOS that 
are S-glutathionylated were identified by mass spectrometry, and site-directed 
mutagenesis was performed to determine their role in enzyme function in vitro 
and in vivo. Acetylcholine-dependent relaxation of aortic rings was used to deter- 
mine endothelium-dependent vasodilator function. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

heNOS purification. heNOS was purified from an Escherichia coli overexpression 
system in which plasmids expressing heNOS (pCWheNOS or pDEST17heNOS) 
and calmodulin (pCaM) were co-transformed into BL21(DE3). The detailed 
expression procedures have been described previously'**". 

Determination of protein and haem content. Protein concentration of purified 
heNOS was determined by the Bradford protein assay (Bio-Rad), with BSA as a 
standard. The haem content of heNOS was determined by pyridine haemochromogen 
assay. heNOS (50 µg) was added to a solution containing 0.15 M NaOH and 1.8 M 
pyridine, and the difference spectrum (reduced minus oxidized bispyridine haem) was 
recorded and quantified by using Δε = 24 mM⁻¹ cm⁻¹ at 556–538 nm. Reduction of 
the bispyridine haem was achieved by the addition of a few grains of dithionite. 

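The quantification step above is an application of the Beer-Lambert law to the difference spectrum. The following is a minimal sketch of that arithmetic, assuming a 1 cm path length; the function name, the `dilution` parameter and the absorbance readings are hypothetical and are not taken from the paper.

```python
# Difference extinction coefficient for reduced-minus-oxidized bispyridine haem
# (556 nm peak minus 538 nm trough), as quoted in the text.
DELTA_EPSILON_MM = 24.0  # mM^-1 cm^-1


def haem_concentration_um(delta_a_556, delta_a_538, path_cm=1.0, dilution=1.0):
    """Return the haem concentration in micromolar.

    delta_a_556 and delta_a_538 are the reduced-minus-oxidized absorbance
    differences at the peak and the trough. path_cm and dilution are
    assumptions of this sketch (cuvette path length, fold dilution).
    """
    delta_a = delta_a_556 - delta_a_538
    conc_mm = delta_a / (DELTA_EPSILON_MM * path_cm)  # Beer-Lambert law
    return conc_mm * 1000.0 * dilution                # mM -> uM


# Hypothetical readings, for illustration only.
example = haem_concentration_um(0.030, -0.018)
print(f"haem = {example:.2f} uM")
```

Dividing the haem concentration by the protein concentration from the Bradford assay would then give the haem content per mole of enzyme.
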
Thiol modification of heNOS. To induce protein S-glutathionylation in vitro, 
purified heNOS was incubated for 20 min with the specified GSSG concentration 
in 50 mM Tris-HCl pH 7.4 at room temperature”. To alkylate all accessible thiols 
on heNOS, the purified heNOS was incubated for 20 min with 1 mM NEM in 
50 mM Tris pH 7.4 at room temperature. Thiol-modified enzyme preparations 
were then subjected to further analysis: immunoblotting, NO activity assay, and 
NO and O₂•⁻ measurement. For mass spectrometric identification of sites 
of S-glutathionylation, heNOS was incubated for 1 h with 2 mM GSSG at room 
temperature and then subjected to SDS-PAGE separation under non-reducing 
conditions. The molar ratio of eNOS to GSSG was 1:250 when 2 mM GSSG was 
used for the reaction. 

Measurement of NOS activity. NOS activity was measured by the conversion of 
L-[¹⁴C]arginine to L-[¹⁴C]citrulline in a total volume of 200 µl of buffer containing 
50 mM Tris-HCl pH 7.4, 100 µM L-Arg, 1 µM L-[¹⁴C]arginine, 0.5 mM NADPH, 
0.5 mM Ca²⁺, 10 µg ml⁻¹ calmodulin, 10 µM BH₄ and 5 µg ml⁻¹ purified eNOS. 
After incubation for 10 min at 37 °C, the reaction was terminated by the addition 
of 3 ml of ice-cold stop buffer (20 mM HEPES pH 5.5, 2 mM EDTA, 2 mM EGTA). 
L-[¹⁴C]Citrulline was separated by passing reaction mixtures through Dowex AG 
50W-X8 (Na⁺ form; Sigma) cation-exchange columns and quantified by liquid 
scintillation counting. 

Measurement of NADPH consumption. NADPH oxidation was followed 
spectrophotometrically at 340 nm with a Varian Cary 300 UV-Vis spectrophotometer. 
The reaction mixture (total volume 500 µl) contained 10 µg of CaM, 
100 µM L-Arg, 200 µM NADPH, 10 µM BH₄ and 500 µM CaCl₂ in 50 mM Tris-HCl 
pH 7.4. heNOS (2–5 µg) was used in the NADPH consumption assay. The 
reaction was initiated by the addition of 10 µl of 10 mM NADPH, and all experiments 
were run at room temperature. The rate of NADPH oxidation during the 
first 10 min was followed and the initial rate was calculated from the linear portion 
and an extinction coefficient of 6.22 mM⁻¹ cm⁻¹. 
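The initial-rate calculation described above can be sketched as a least-squares slope of A340 against time, converted with the stated extinction coefficient. This is an illustrative sketch only: a 1 cm path length is assumed, the helper names are hypothetical, and the absorbance trace is invented for the example.

```python
EPSILON_NADPH_MM = 6.22  # mM^-1 cm^-1 at 340 nm, as quoted in the text


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def nadph_rate_nmol_per_min(times_min, a340, volume_ml=0.5, path_cm=1.0):
    """NADPH oxidized per minute (nmol) over the linear portion of the trace."""
    d_abs_per_min = -slope(times_min, a340)      # absorbance falls as NADPH is oxidized
    rate_mm_per_min = d_abs_per_min / (EPSILON_NADPH_MM * path_cm)
    return rate_mm_per_min * volume_ml * 1000.0  # mM x ml = umol; x1000 -> nmol


# Hypothetical linear trace over the first 4 min of a 500 ul reaction.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
trace = [1.2440, 1.2129, 1.1818, 1.1507, 1.1196]
rate = nadph_rate_nmol_per_min(times, trace)
specific_activity = rate / 0.005                 # per mg, assuming 5 ug of heNOS
print(f"{rate:.2f} nmol/min, {specific_activity:.0f} nmol/min/mg")
```

Restricting the fit to the linear portion of the trace matters because substrate depletion flattens the curve at later time points.
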

Measurement of O₂•⁻ generation by EPR spin trapping. Spin-trapping measurements 
of oxygen radical production from heNOS were performed in 50 mM 
Tris-HCl buffer pH 7.4 containing 0.5 mM NADPH, 0.5 mM Ca²⁺, 10 µg ml⁻¹ 
calmodulin, 15 µg ml⁻¹ purified heNOS and 25 mM DEPMPO. For these 
measurements the binding of BH₄ to heNOS was reconstituted in advance by 
incubation of the enzyme with 100 µM BH₄ for 3 h; the unbound BH₄ was then 
removed to prevent superoxide scavenging. EPR spectra were recorded in a 50-µl 
capillary at room temperature with a Bruker EMX spectrometer operating at 
9.86 GHz with 100 kHz modulation frequency, as 
described. Spectra were measured by using the following parameters: centre field 
3,510 G; sweep width 140G; power 20 mW; receiver gain 2 10°; modulation 
amplitude 0.5 G; conversion time 41 ms; time constant 328 ms. 

SDS-PAGE and immunoblotting. The standard procedures for SDS-PAGE and 
immunoblotting were followed as described previously’*. The reaction mixture 
was separated on a 4–20% Tris-glycine polyacrylamide gradient gel. Samples were 
run at room temperature for 1.5 h at 125 V. Protein bands were transferred electrophoretically 
to a nitrocellulose membrane in 12 mM Tris-HCl, 96 mM glycine, 
20% methanol with an XCell II Blot Module (Invitrogen) at a constant 25 V for 
90 min. Membranes were blocked for 1 h at room temperature in Tris-buffered 
saline (TBS) containing 0.05% Tween 20 (TTBS), with 5% dried milk (Bio-Rad). 
Membranes were then incubated overnight with anti-glutathione monoclonal 
antibody (ViroGen) or anti-eNOS polyclonal antibody (Santa Cruz) at 4°C. 
Membranes were then washed three times in TTBS and incubated for 1h with 
horseradish peroxidase-conjugated anti-mouse or anti-rabbit IgG in TTBS at 
room temperature. Membranes were again washed three times in TTBS and were 
then detected with ECL Western Blotting detection reagents (Amersham 
Biosciences). The signal intensity of blotting was digitized and quantified with 
ImageJ from the National Institutes of Health. 
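
The densitometric quantification can be illustrated with a short sketch of the ratio reported in Fig. 4c, that is, the anti-GSH signal divided by the anti-eNOS signal and normalized to a reference lane. The band intensities below are invented, and the function name is hypothetical; this is not the authors' ImageJ workflow.

```python
def normalized_ratios(gsh_signal, enos_signal, reference):
    """Return the GSH/eNOS band ratio per lane, scaled so the reference lane is 1."""
    ratios = {lane: gsh_signal[lane] / enos_signal[lane] for lane in gsh_signal}
    ref_ratio = ratios[reference]
    return {lane: ratio / ref_ratio for lane, ratio in ratios.items()}


# Hypothetical integrated band intensities (arbitrary units).
gsh = {"WKY": 120.0, "SHR": 2400.0}
enos = {"WKY": 3000.0, "SHR": 3000.0}

normalized = normalized_ratios(gsh, enos, reference="SHR")
for lane, value in normalized.items():
    print(f"{lane}: {value:.2f}")
```

Dividing by the eNOS signal corrects for differences in the amount of immunoprecipitated enzyme between lanes before the lanes are compared.
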

Mass spectrometry. The protein sample was subjected to SDS-PAGE on a 4-20% 
gradient polyacrylamide gel. Protein bands on the gel were then stained with 
Coomassie blue. The band containing S-glutathionylation of heNOS, which was 
confirmed by immunoblotting against anti-GSH antibody, was cut and digested 
in-gel with trypsin, chymotrypsin, or trypsin and chymotrypsin before mass spectrometric 
measurement. 

The S-glutathionylation of heNOS was determined with capillary-liquid chromatography 
tandem mass spectrometry (Nano-LC-MS/MS), which was performed 
on an LTQ or LTQ Orbitrap mass spectrometer (Thermo). The detailed 
parameters used in the MS measurements have been described in our previous 
study. Sequence information from MS/MS data was processed with Mascot 
Distiller software, by using standard data processing parameters. Database 
searches were performed with the MASCOT (Matrix Science) program. 

Modelling. The three-dimensional structure of the heNOS reductase domain was 
predicted by use of the Swiss-Model First Approach Mode. The input sequence 
spanned Ala 515 to Ser 1177 of heNOS. The lower BLAST P(N) limit for 
template selection was set to 0.00001. The three-dimensional structure of the 
reductase domain of rat neuronal NOS (PDB ID 1F20) was also used as the 
self-input template file for the tertiary structure prediction of the heNOS reductase 
domain. The final model output was a Swiss-PDB viewer project file. PyMOL 
(DeLano Scientific LLC) was used to construct and view the three-dimensional 
structure of the heNOS reductase domain. 

Site-directed mutagenesis of heNOS. For bacterial expression, the human NOS3 
gene was subcloned into pDEST17 vector (Invitrogen). It contains a His tag at the 
amino terminus of heNOS. The reading frame and heNOS sequence were confirmed 
by DNA sequencing. QuikChange site-directed mutagenesis (Stratagene) 
was used for heNOS Cys→Ser mutations. Primers for each mutation were as 
follows: Cys 689→Ser, 5′-GGCGACGAGCTGAGCGGCCAGGAGG-3′ (sense) 
and 5′-CCTCCTGGCCGCTCAGCTCGTCGCC-3′ (antisense); Cys 908→Ser, 
5′-GAAGTGGTTCCGCAGCCCCACGCTGC-3′ (sense) and 
5′-GCAGCGTGGGGCTGCGGAACCACTTC-3′ (antisense). The sequence of each heNOS mutant 
was further confirmed by DNA sequencing. The detailed procedures of protein 
expression and purification have been described previously. 

For mammalian expression, the human NOS3 gene was subcloned into pcDNA-DEST40 
Gateway vector (pc40heNOS) (Invitrogen). It contains the V5 epitope and 
a His tag at the carboxy terminus of heNOS. The reading frame and heNOS 
sequence were confirmed by DNA sequencing. QuikChange site-directed mutagenesis 
(Stratagene) was used for heNOS Cys→Ala mutations. Primers for each 
mutation were as follows: Cys 689→Ala, 5′-GGCGACGAGCTGGCCGGCCAGGAGG-3′ 
(sense) and 5′-CCTCCTGGCCGGCCAGCTCGTCGCC-3′ (antisense); 
Cys 908→Ala, 5′-GAAGTGGTTCCGCGCCCCCACGCTGCTG-3′ (sense) and 
5′-CAGCAGCGTGGGGGCGCGGAACCACTTC-3′ (antisense). The sequence 
of the heNOS double mutant (C689A/C908A) from pc40heNOS was further confirmed 
by DNA sequencing. COS-7 cells were used for heNOS overexpression in 
cellular assays, because there is no eNOS in COS-7, as 
reported previously. 

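QuikChange-style mutagenesis uses primer pairs that are exact reverse complements of each other, so the sequences listed above can be sanity-checked after OCR line breaks are removed. The sketch below does this for the four primer pairs quoted in the text; the dictionary keys and function names are labels chosen for this example.

```python
# Primer pairs copied from the Methods text as (sense, antisense), both written 5'->3'.
PRIMERS = {
    "C689S": ("GGCGACGAGCTGAGCGGCCAGGAGG", "CCTCCTGGCCGCTCAGCTCGTCGCC"),
    "C908S": ("GAAGTGGTTCCGCAGCCCCACGCTGC", "GCAGCGTGGGGCTGCGGAACCACTTC"),
    "C689A": ("GGCGACGAGCTGGCCGGCCAGGAGG", "CCTCCTGGCCGGCCAGCTCGTCGCC"),
    "C908A": ("GAAGTGGTTCCGCGCCCCCACGCTGCTG", "CAGCAGCGTGGGGGCGCGGAACCACTTC"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def check_pairs(primers):
    """Map each mutation label to True if antisense == revcomp(sense)."""
    return {
        name: reverse_complement(sense) == antisense
        for name, (sense, antisense) in primers.items()
    }


results = check_pairs(PRIMERS)
for name, ok in results.items():
    print(name, "consistent" if ok else "MISMATCH")
```

A mismatch would point to a transcription error in one strand of a pair, which is a useful check when primer sequences are re-keyed from a printed source.
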
Fluorescence and immunofluorescence microscopy. BAECs cultured on 22-mm² 
sterile coverslips (Harvard Apparatus) in 35-mm sterile dishes at a density of 10⁴ 
cells per dish were subjected to treatment with BCNU for 4 h. BCNU, an inhibitor of 
glutathione reductase, has been shown to alter cellular redox environment, leading 
to an increased GSSG/GSH ratio. The increase in oxidized GSH leads to increased 
cellular S-glutathiolation. At the end of the experiment, cells attached to coverslips 
were washed with PBS, fixed for 10 min with 3.7% paraformaldehyde and permea- 
bilized for 5 min with 0.25% Triton X-100 in Tris-buffered saline containing 0.01% 
Tween 20 (TBST), washed three times, and then blocked for 30 min with 1% BSA in 
TBST. Permeabilization is required to provide access for the antibody to the antigen 
throughout the cell. Permeabilization and washing is also critical for the detection of 
protein-bound GSH adducts, because it clears the free GSH that would otherwise 
bind the antibody. For detection of S-glutathionylation and eNOS, the fixed and 
permeabilized cells were incubated at room temperature for 1 h with mouse anti- 
GSH and rabbit anti-eNOS primary antibodies at a dilution of 1:2,000 in TBST 
containing 1% BSA, followed by secondary anti-mouse Alexa fluor-568 and anti- 
rabbit Alexa fluor-488-conjugated antibody (1:1,000 dilution) for 1 h at room tem- 
perature. The coverslips with cells were then mounted on a glass slide with 
Fluoromount G mounting medium and viewed with an Olympus FluoView-1000 
confocal microscope at ×60 magnification, and data were captured digitally and 
analysed. 

To detect O₂•⁻ generation from S-glutathionylated eNOS in BAECs, live cells were 
incubated with the O₂•⁻ indicator DHE (10 µM). 
DHE fluoresces when oxidized by O₂•⁻. Nuclei were stained with blue-fluorescent 
DAPI (1 µM) for 10 min in the incubator. After incubation, cells were washed with 
PBS and mounted; images were captured and analysed at ×60 magnification by 
confocal fluorescence microscopy, and overlaid with LSM software. 

En face sections. After surgery the aortae from WKY and SHR rats were cleaned 
and washed with ice-cold PBS. A slit was made longitudinally and the opened 
aortae were fixed in 3.7% paraformaldehyde for 3h at room temperature. The 
fixed aortae were washed for 2–3 h in 0.1 M cacodylate buffer and then incubated 
overnight with 2.3 M sucrose gradients titrated for 10 min each with 5% sucrose in 
cacodylate buffer (2:1, 1:1, 1:2 and 1:3) and 2.3 M sucrose at 4 °C. The samples were 
then mounted in OCT medium and frozen in liquid nitrogen*®. Tissues were 
cryosectioned en face from anterior to posterior and the sections were probed 
for eNOS and PrS-SG. The sections were permeabilized with 0.25% Triton X-100 
for 10 min and washed, followed by immunostaining for eNOS/PrS-SG as 
described above. High-magnification images were obtained and analysed with 
an Olympus FluoView-1000 confocal microscope at ×100 original magnification. 

EPR spin-trapping measurement of NO production. Spin-trapping measurements 
of NO from BAECs were performed with a Bruker EMX spectrometer with 
Fe-N-methyl-D-glucamine dithiocarbamate (Fe-MGD) as the spin trap. Spin-trapping 
experiments were performed on cells grown in six-well plates (10⁶ cells 
per well). Before EPR spin-trapping measurements, control cells or cells treated 
with BCNU were washed twice with PBS (without CaCl₂ or MgCl₂). Next, 0.8 ml of 
PBS containing glucose (1 g l⁻¹), CaCl₂, MgCl₂, the NO spin-trap Fe-MGD 
(0.5 mM Fe²⁺, 5.0 mM MGD) and calcium ionophore (1 µM) was added to each 
well, and the plates were incubated for 20 min at 37 °C in a humidified environment 
containing 5% CO₂/95% O₂. After incubation, the medium from each well 
was removed, and the trapped NO in the supernatants was quantified by EPR. 
Spectra recorded from these cellular preparations were obtained with the following 
parameters: microwave power 20 mW; modulation amplitude 4.0 G; modulation 
frequency 100 kHz. 

Aortic preparations and functional measurements. Aortae were excised from 
anaesthetized and heparinized rats, placed in ice-cold buffer, cleaned of loosely 
adhering fat and connective tissue and cut into rings 5 mm in length for measure- 
ments of vascular tone as described previously, with minor modification”. 

In brief, aortic rings were mounted horizontally and connected to an isometric 
force transducer in organ chambers (Multi Wire Myograph, Model 610M; DMT) 
filled with 5 ml of Krebs–Henseleit (K-H) buffer (37 °C, pH 7.4) consisting of 
118 mM NaCl, 46 mM KCl, 1.2 mM CaCl₂, 1.2 mM NaH₂PO₄, 24 mM 
NaHCO₃, 18 mM glucose, 10 µM indomethacin and 4.6 mM HEPES bubbled with 
95% O₂/5% CO₂. The aortic segments were allowed to equilibrate for 60 min with 
an initial tension of 1 g. The stability of each ring was checked by the successive 
administration of 4 M KCl. Preparations were then washed three times with drug-free 
oxygenated K-H buffer and allowed to relax fully for 15 min before the 
experimental protocol began. Then the aortic rings were contracted with phenylephrine 
(10 µM) and, after stable contraction, the vasorelaxant effects of cumulative 
addition of acetylcholine, NONOate or sodium nitroprusside (SNP) were 
determined by measuring the tension and expressing this as the percentage relaxa- 
tion with respect to the maximal phenylephrine contraction. To induce 
S-glutathionylation, rings were pretreated with 80 µM BCNU during the 60 min 
equilibration period, and to reverse S-glutathionylation the rings were treated with 
1 mM DTT for 20 min. Similarly, to reverse the intrinsic S-glutathionylation present, 
SHR or WKY aortae or aortic rings were treated with 1 mM DTT for 20 min 
at 37 °C. For immunoprecipitation studies measuring the effect of DTT in reversing 
eNOS glutathionylation in aortae, DTT was added to the bioassay blood 
vessels, taking care to employ exactly the same conditions with the same duration 
of incubation and concentration of DTT as in the experiments measuring 
endothelial function. We then washed out the DTT and immunoprecipitated 
eNOS from the vessel homogenates. 
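
The relaxation measure used in these experiments can be sketched as follows, expressing each tension reading as a percentage of the phenylephrine-induced contraction. Treating the pre-contraction resting tension as the zero point is an assumption of this sketch, and all tension values are hypothetical.

```python
def percent_relaxation(baseline_g, contracted_g, tension_g):
    """Percentage relaxation relative to the phenylephrine-induced contraction.

    baseline_g   resting tension before phenylephrine (assumed zero point)
    contracted_g stable tension after phenylephrine
    tension_g    tension after a given cumulative agonist concentration
    """
    developed = contracted_g - baseline_g
    return 100.0 * (contracted_g - tension_g) / developed


# Hypothetical ring: 1.0 g resting tension, 2.5 g after phenylephrine, then
# tension readings after increasing acetylcholine concentrations.
readings_g = [2.5, 2.2, 1.75, 1.3]
curve = [percent_relaxation(1.0, 2.5, t) for t in readings_g]
print([round(value, 1) for value in curve])
```

Plotting these percentages against the logarithm of agonist concentration gives the concentration-response curves of the kind shown in Fig. 4.
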

For the BCNU studies, male Sprague-Dawley rats (Harlan) were used. Male 
SHR and WKY rats were supplied by Harlan or Charles River. 

NOS3 gene silencing in bovine aortic endothelial cells. NOS3 gene silencing 
from BAECs was used to confirm eNOS S-glutathionylation induced by BCNU 
contributing to increased cellular superoxide generation. The sequence of NOS3 
siRNA was based on a previous study. The sense siRNA strand of eNOS is 
5′-GAGUUACAAGAUCCGCUUCTT-3′ and the antisense siRNA is 
3′-TTCUCAAUGUUCUAGGCGAAG-5′. These siRNAs were custom synthesized by Invitrogen. 
BLOCK-iT transfection kit (Invitrogen) was used to deliver NOS3 siRNAs to 
BAECs. Scrambled siRNA was used as a negative control. After 48 h, eNOS immu- 
noblotting was used to determine the eNOS knockdown efficiency. The same set of 
BAECs was further treated with 80 µM BCNU for 4 h, and confocal microscopy 
with DHE was used to determine BCNU-induced superoxide generation. 

Homogenization of cells and tissues. Cells or tissues were homogenized in lysis 
buffer (50 mM Tris-HCl pH 7.4, 500 mM NaCl, 1% Nonidet P40, 0.5% sodium 
deoxycholate, 0.1% SDS, 50 mM NEM, and protease inhibitors) with a tissue 
grinder. After homogenization, cell or tissue debris was removed by centrifugation 
at 4 °C for 1 h. The supernatant was further used for immunoblotting analysis or 
immunoprecipitation assay. 

Immunoprecipitation assay. Supernatant of cell or tissue lysate was used for the 
immunoprecipitation assay. Agarose-conjugated anti-eNOS (Santa Cruz) antibody 
was first incubated overnight with cell or tissue lysate at 4°C. After incubation, 
eNOS immunoprecipitation product was washed three times with cold PBS buffer. 
1× SDS loading buffer was used to elute eNOS for immunoblotting analysis of 
eNOS glutathionylation. 

Statistical analysis. All data are expressed as means and s.e.m. All experiments 
were repeated at least three times. Microsoft Excel and Origin were used for data 
analysis. Student’s t-test was used for statistical analysis, with P<0.05 being 
considered significant. 
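
As an illustration of the statistics described above, the sketch below computes the mean, the s.e.m. and an unpaired Student's t statistic with pooled variance using only the Python standard library. The data are invented, and the comparison against a tabulated critical value (2.447 for 6 degrees of freedom, two-tailed, P = 0.05) stands in for the P value that a statistics package would report.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean)."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def student_t(a, b):
    """Unpaired two-sample Student's t statistic (pooled variance) and its d.f."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t_stat = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t_stat, na + nb - 2


# Hypothetical maximal relaxation (%) for n = 4 rings per group.
control = [78.0, 82.0, 85.0, 80.0]
treated = [40.0, 45.0, 38.0, 42.0]

t_stat, dof = student_t(control, treated)
T_CRIT_DF6 = 2.447  # two-tailed critical value for P = 0.05 at 6 degrees of freedom
significant = abs(t_stat) > T_CRIT_DF6
print(mean_sem(control), mean_sem(treated), round(t_stat, 2), dof, significant)
```

The critical value must be changed if the group sizes differ from n = 4, because the degrees of freedom change with them.
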
Interaction of pathogens with cells of the immune system results in 
activation of inflammatory gene expression. This response, 
although vital for immune defence, is frequently deleterious to 
the host due to the exaggerated production of inflammatory proteins. 
The scope of inflammatory responses reflects the activation state of 
signalling proteins upstream of inflammatory genes as well as signal- 
induced assembly of nuclear chromatin complexes that support 
mRNA expression’~*. Recognition of post-translationally modified 
histones by nuclear proteins that initiate mRNA transcription and 
support mRNA elongation is a critical step in the regulation of gene 
expression’ "°. Here we present a novel pharmacological approach 
that targets inflammatory gene expression by interfering with the 
recognition of acetylated histones by the bromodomain and extra 
terminal domain (BET) family of proteins. We describe a synthetic 
compound (I-BET) that by ‘mimicking’ acetylated histones disrupts 
chromatin complexes responsible for the expression of key inflam- 
matory genes in activated macrophages, and confers protection 
against lipopolysaccharide-induced endotoxic shock and bacteria- 
induced sepsis. Our findings suggest that synthetic compounds 
specifically targeting proteins that recognize post-translationally 
modified histones can serve as a new generation of immunomodu- 
latory drugs. 

BET proteins BRD2, BRD3 and BRD4 (hereafter defined as BET) 
govern the assembly of histone acetylation-dependent chromatin 
complexes that regulate inflammatory gene expression” *. This func- 
tion of BET suggests the possibility of intervention with inflammatory 
gene expression by disrupting chromatin complexes essential for 
mRNA transcription, elongation and splicing. 

The diversity of binding surfaces, created by differences in sequences 
surrounding the bromodomain acetyl-binding pocket of BET, and 
other bromodomain-containing proteins, provided a foundation for 
selective pharmacological targeting of BET?'’"*. Using an approach 
that uses the ability of synthetic compounds to bind selectively to 
individual proteins in cell lysates (see Supplementary Material), we 
identified compounds that interact with BET. One of these compounds, 
GSK525762A (Fig. 1a), henceforth referred to as I-BET, 
showed the highest affinity interaction with BET (Fig. 1). The crystal 
structure of I-BET bound to BRD4-bromodomain 1 (BD1) showed 
I-BET positioned at the acetyl-lysine (AcK)-binding pocket (Fig. 1b 
and Supplementary Fig. 1a, b). Hydrogen-bonding interactions essential 
for binding of AcK to asparagine 140 and tyrosine 97 within the 
bromodomain were mimicked by the triazoyl ring of I-BET (Fig. 1b). 
The selectivity of I-BET interaction with BET was determined by the 
ZA hydrophobic channel and WPF shelf outside of the AcK binding 
pocket, where a conserved isoleucine or valine impose spatial con- 
straints on the size of molecules that can gain access to the WPF 
shelf (Fig. 1b and Supplementary Fig. 1b, c). Indeed, an enantiomer 
compound of I-BET (GSK525768A) had no activity towards BET 
(Fig. 1c, far right panel). The structural features of I-BET allow two 
molecules of I-BET to bind to the tandem bromodomains of BET with 
high affinity (dissociation constant Kd of 50.5–61.3 nM; Fig. 1c and 
Supplementary Fig. 1d, e). Moreover, I-BET could successfully com- 
pete with AcK within the recognition pocket of BET. Fluorescence 
resonance energy transfer (FRET) analysis demonstrated that I-BET 
displaced, with high efficacy (half-maximum inhibitory concentration 
IC₅₀ of 32.5–42.5 nM), a tetra-acetylated H4 peptide that had been pre-bound 
to tandem bromodomains of BET (Fig. 1d and Supplementary 
Fig. 1e). I-BET is highly selective as it did not interact with other 
bromodomain-containing proteins from each arm of the phylogeny 
tree (Supplementary Fig. 1f) and had no activity towards a panel of 38 
unrelated proteins (Supplementary Table 1). 

Stimulation of bone marrow-derived macrophages (BMDMs) with 
lipopolysaccharide (LPS) upregulated numerous inflammatory genes 
(Fig. 2a). Pre-treatment of BMDMs with I-BET shortly before LPS 
stimulation resulted in the downregulation of 38 and 151 of the LPS- 
inducible genes at 1 and 4 h, respectively (Fig. 2a, b and Supplementary 
Table 2). I-BET suppressed the expression of key LPS-inducible cytokines 
and chemokines, including Il6, Ifnb1, Il1b, Il12a, Cxcl9 and Ccl12. 
The inhibitory effect of I-BET on the expression of the IL-1 processing 
enzyme Mefv underscored the potential of I-BET to control the 
IL-1β inflammatory circuit. Furthermore, diminished expression of 
transcription factors Rel, Irf4 and Irf8 point to the ability of I-BET to 
curtail the initial wave of inflammatory gene expression (Fig. 2b and 
Supplementary Table 2). In the absence of LPS stimulation, treatment 
of BMDMs with I-BET had a marginal effect on gene transcription 
(Supplementary Fig. 3 and Supplementary Table 3) and did not have 
an impact on the expression of Tlr1–13, Myd88, Ticam1, Cd14, Mapk, 
Mapk2k, Map3k and Map4k family members, Ikbkb, Ikbke and Ikbkg 
and Aoah, in unstimulated or LPS-treated macrophages at 1 h, that 
control LPS sensing and signalling (Supplementary Fig. 4). 
Furthermore, an unaltered pattern of LPS-induced ERK phosphorylation 
and IκBα degradation in I-BET-treated cells excluded the impact 
of I-BET on gene expression through dysregulation of Toll-like recep- 
tor 4 (TLR4)-dependent signalling (Supplementary Fig. 5). I-BET also 
had no effect on the expression of housekeeping genes or the viability 
of BMDMs (Supplementary Figs 4 and 5). The impact of I-BET on 
LPS-inducible gene expression is highly selective. The cytokine Tnf as 
well as chemokines Ccl2-5, Cxcl1/2 were not affected by I-BET 
(Supplementary Table 2 and Supplementary Fig. 6). This specificity 
and anti-inflammatory potential of I-BET has been validated by the 
similarity between the effects of I-BET treatment and siRNA-mediated 
BET knockdown on inflammatory gene expression (Supplementary 
Fig. 7). Notably, knockdown of BET suppressed the expression of 
Tnf that was resistant to I-BET (Supplementary Figs 6 and 7). This 
result points to the existence of BET-recruiting mechanisms that are 
independent of BET interaction with acetylated histones. The existence 
of such a mechanism is supported by findings that show recruitment of 
BET to acetylated Rela or mediator complex’*’”'*. Certain genes were 
upregulated by I-BET treatment but none of these have a well- 
established role in inflammation (Supplementary Table 2). The upre- 
gulation of Brd2 and histone-encoding genes (Supplementary Table 2) 
may reflect the existence of a positive feedback mechanism where 
suppression of BET leads to a compensatory increase in the expression 
of chromatin proteins. The activating effect of I-BET on gene expres- 
sion may also reflect the ability of BET to function not only as tran- 
scriptional co-activators but also as co-repressors””. 

The genome-wide analysis of the epigenetic states of LPS-inducible 
genes that were significantly suppressed or not affected by I-BET 
(sI-BET and naI-BET genes, respectively) provided a clue for the selective 
effect of I-BET on gene expression. Elevated basal levels of histone 
H3 and H4 acetylation (H3ac and H4ac) at the naI-BET gene promoters 
Figure 1 | I-BET is a selective antagonist of BET proteins. a, Chemical 
structure of GSK525762A (I-BET). b, Structure of I-BET (orange) bound to the 
acetyl-binding pocket of BRD4-BD1 overlaid with acetylated histone H4 
peptide (H4ac, green). The ‘WPF shelf’ (W81, P82, F83) as well as the 
asparagine N140 essential for acetylated lysine (Kac) binding are indicated. 
c, I-BET binds with high affinity to BET proteins as determined by isothermal 
titration calorimetry (ITC) of tandem bromodomain fragments of BRD2 (1–473), 
BRD3 (1–434), BRD4 (1–477) interaction with I-BET or BRD4 (1–477) 
interaction with an inactive enantiomer of I-BET (inactive I-BET). Time 
courses of raw injection heats (upper panel) and normalized binding 
enthalpies, calculated using a single site binding model (Origin software, 
Microcal, lower panel), are shown. d, I-BET competes with H4ac peptide for 
bromodomain binding. Displacement of tetra-acetylated histone H4 peptide 
from bromodomains of BRD2 (blue), BRD3 (black) and BRD4 (red) by I-BET 
was determined by FRET analysis. 


1120 | NATURE | VOL 468 | 23/30 DECEMBER 2010 


indicated that naI-BET genes were already primed or actively involved 
in transcription (Fig. 3 and Supplementary Fig. 8). Indeed, the naI-BET 
gene promoters were associated with higher basal levels of H3K4me3 
and RNA polymerase (Pol) II, including the elongation competent 
RNA Pol II, phosphorylated at serine 2 (RNA Pol II S2; Fig. 3). The 
important role of the primed/active state in defining the sensitivity to 
I-BET was underscored by the lack of I-BET effect on expression of 
housekeeping genes such as Gapdh, Tubb5 and Hprt (Supplementary 
Fig. 4) that are characterized by high levels of H3ac, H4ac, H3K4me3 
and RNA Pol II at their promoters’. Furthermore, an increase in overall 
histone acetylation levels caused by BMDM treatment with the histone 
deacetylase (HDAC) inhibitor trichostatin A (TSA) was able to ‘convert’ 
sI-BET into naI-BET genes (Supplementary Fig. 9). 

The primed and/or active transcription state of naI-BET genes 
before LPS stimulation was accomplished without recruitment of sig- 
nificant amounts of BET, thus reducing the likelihood of suppression 
of these genes by I-BET (Fig. 3). Furthermore, treatment with I-BET 
has less impact on BET association with naI-BET compared to sI-BET 
gene promoters in LPS-treated cells (Fig. 3). The mechanism of this 
Figure 2 | I-BET suppresses a specific subset of LPS-inducible genes. 

a, Venn diagrams display the number of LPS-inducible (>twofold, red circles) 
genes that were suppressed (>twofold, green circles) or upregulated 
(>twofold, yellow circles) by I-BET (1 µM) treatment at 1 or 4 h after LPS 
stimulation (left and right panels) of BMDMs. b, Heat map representation of 
expression levels of genes that were downregulated by I-BET at 1 h (left panel) 
and 4 h (right panel) after LPS stimulation of three independent macrophage 
cultures. Scale ranges from a signal value of 2⁶ (64, green) to 2¹⁴ (16,384, red). 
Fold-change values are listed. Table shows the distribution of downregulated 
genes into functional categories. 
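
The twofold thresholds used for the Venn diagrams in Fig. 2a can be sketched as a simple classification rule. The function name, the way the fold changes are defined and the example values are assumptions made for illustration; they are not the authors' analysis pipeline.

```python
def classify(lps_fold, ibet_fold, threshold=2.0):
    """Classify a gene using twofold cut-offs.

    lps_fold  expression with LPS divided by unstimulated expression
    ibet_fold expression with LPS + I-BET divided by expression with LPS alone
    """
    if lps_fold <= threshold:
        return "not LPS-inducible"
    if ibet_fold < 1.0 / threshold:
        return "suppressed"
    if ibet_fold > threshold:
        return "upregulated"
    return "unaffected"


# Hypothetical fold changes for three placeholder genes.
examples = {
    "gene_A": (8.0, 0.3),
    "gene_B": (8.0, 0.9),
    "gene_C": (1.5, 0.2),
}
calls = {gene: classify(*folds) for gene, folds in examples.items()}
print(calls)
```

On the heat-map scale in Fig. 2b, a twofold change corresponds to one unit on the log2 axis, which runs from 6 (signal 64) to 14 (signal 16,384).
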
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Figure 3 | Epigenetic profiles of genes suppressed or unaffected by I-BET in 
LPS-stimulated macrophages. a, Genome-wide epigenetic profiles of sI-BET 
or nal-BET genes in unstimulated or LPS-stimulated (1h) macrophages pre- 
treated with 5 1M of I-BET ora DMSO control. Analysed epigenetic marks are 
indicated. y-Axes represent the number of reads per million mapped reads. 

b, Epigenetic profiles of Il6 and Tnf. The y-axes represent the average number of 
tags per gene per 25 base pairs per 1,000,000 mapped reads. Scale values are 


phenomenon may reflect higher LPS-induced H3ac and H4ac levels at 
naI-BET compared to sI-BET gene promoters before and after I-BET 
treatment (Fig. 3 and Supplementary Fig. 8). Additionally, some of the 
naI-BET genes may recruit BET via histone acetylation-independent 
mechanisms’. 

Treatment of BMDMs with I-BET affected not only the promoter- 
bound BET but also the levels of H3ac, H4K5ac, H4K8ac, H4K12ac 
and total H4ac on LPS-induced gene promoters (Fig. 3 and Sup- 
plementary Fig. 8). The mechanism of the negative impact of I-BET 
on histone H3 and H4 acetylation might be twofold. First, by preventing 
BET from binding to H3ac/H4ac, I-BET increases the accessibility of 
exposed H3ac/H4ac to HDACs. This model is supported by a non- 
enzymatic role of BRD4 in H4ac preservation in embryonic stem 
(ES) cells*®. It is also possible that I-BET binding to BET prevents the 
indicated in parentheses. c, The abundance of epigenetic marks on Il6 and Tnf 
gene promoters was quantified by ChIP qPCR from four (BRD3, BRD4, Pol II 
and Pol II S2) or two (BRD2 and P-TEFb) independent experiments performed 
in triplicate. Error bars are s.e.m. of independent experiments or s.d. of 
representative experiments, respectively. Asterisks indicate P < 0.05 as 
determined by an unpaired t-test. 


formation of multi-molecular complexes that contain histone acetyl- 
transferases (HATs), other histone modifying enzymes, including 
lysine H3K4me3 methyltransferases, as well as the positive transcrip- 
tional elongation factor b (P-TEFb) and RNA Pol II’*?*’. This model 
is consistent with diminished levels of P-TEFb, which contributes to 
mRNA elongation by RNA Pol II phosphorylation*”’”, and reduced 
amounts of H3K4me3 and RNA Pol II at sI-BET genes (Fig. 3). The 
possible direct impact of I-BET on H3ac/H4ac through inhibition of 
bromodomain-containing HATs was excluded by the inability of 
I-BET to suppress the activity of the most common HATs such as 
pCAF, p300, GCN5 and CBP (also known as Kat2b, Ep300, Kat2a 
and Crebbp, respectively; Supplementary Fig. 10). 

The features of sI-BET and naI-BET genes assessed by the genome- 
wide analysis were mirrored by the epigenetic states of selected sI-BET 
Figure 4 | I-BET suppresses inflammation in vivo. a, Kaplan-Meier survival 
curves of: LPS-treated C57BL/6 mice (5 mg per kg, i.p., n = 12 per group) that 
were injected i.v. with a solvent control (black squares) or 30 mg per kg of I-BET 
1 h before (blue triangles) or 1.5 h after (blue circles) LPS administration (left 
panel); mice injected i.v. with heat-killed Salmonella typhimurium, strain IR715 
(5 × 10” per kg, n = 10 per group) (middle panel); or mice subjected to caecal 
ligation puncture (CLP) procedure that were administered a solvent control or 
30 mg per kg of I-BET twice a day for 2 days (n = 8 per group) (right panel). 
b, Serum titres of indicated cytokines were measured by ELISA (n = 10 per 
group). Mice received a solvent control (black squares) or I-BET (blue triangles) 
1 h before LPS injection and samples were collected at 2 h after LPS treatment. 
***P < 0.001, **P < 0.01, *P < 0.05 as determined by unpaired t-test. 


and naI-BET genes. Following I-BET treatment, the promoter of the 
sI-BET gene Il6 showed a marked reduction in BET recruitment and 
diminished levels of associated H3K4me3, P-TEFb, RNA Pol II and 
RNA Pol II S2 (Fig. 3b, c). In contrast to Il6, the naI-BET gene Tnf 
showed higher accumulation of BRD2, BRD3 and to a lesser extent 
BRD4, around the transcriptional start site (TSS) following I-BET 
treatment (Fig. 3b, c). The relatively higher BET levels at the Tnf locus 
were associated with largely unaffected levels of P-TEFb, RNA Pol II 
and RNA Pol II S2 (Fig. 3b, c). In support of distinct epigenetic states 
between sI-BET and naI-BET genes, the sI-BET gene Il1b had reduced 
BET accumulation at its TSS that resulted in a drop of P-TEFb, Pol II 
and Pol II S2 levels. In contrast, the epigenetic landscape of the naI-BET 
gene Nfkbia displayed little change in response to I-BET (Supplementary 
Fig. 11). 

The selectivity of gene responses to I-BET correlated inversely with 
the timing of LPS-induced gene activation. In contrast to early-stimulated 
(primary response) naI-BET genes, the majority of sI-BET genes, with 
the exception of Il1b, belong to the category of secondary response 
genes (SRG) that become upregulated at later points of macrophage 
activation (Supplementary Fig. 12a, c). Most of the sI-BET genes, as well 
as Il1b, were characterized by low basal levels of H3ac/H4ac, H3K4me3, 
RNA Pol II, as well as low CpG content of their promoters (Fig. 3, 
Supplementary Figs 8 and 12b). The latter feature conveys higher 
stability to promoter-associated nucleosomes that generates a selective 
barrier for transcriptional activation of secondary response genes”*™*. It 
is likely that suppression of BET recruitment as well as reduction in 
H3ac/H4ac and H3K4me3 by I-BET aggravates the already non- 
permissive transcriptional state of the sI-BET genes and reduces the 
probability of their expression, thus defining the selectivity of I-BET. 

The suppression of key inflammatory genes by I-BET suggested a 
potent ability of the compound to treat inflammatory conditions in 
vivo. The serum titres of intravenously (i.v.) administered I-BET 
remain within the effective concentrations for several hours after injec- 
tion (Supplementary Fig. 13). Injection of I-BET in mice before the 
initiation of LPS- or heat-killed Salmonella typhimurium-induced 
endotoxic shock was able to prevent or attenuate death of mice 
(Fig. 4a left and middle panel). Most promisingly for therapeutic 
applications, a single dose of I-BET applied at 1.5 h after LPS injection, 
at the time when mice started to develop symptoms of inflammatory 
disease, cured the mice (Fig. 4a, left panel). Furthermore, in mice that 
suffer from polymicrobial peritonitis and sepsis caused by caecal liga- 
tion and puncture (CLP), twice-daily injections of I-BET for 2 days 
protected mice against death caused by sepsis (Fig. 4a, right panel). 
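
Survival curves such as those in Fig. 4a are Kaplan-Meier estimates. A minimal sketch of that estimator follows; the function name and the example follow-up times are hypothetical and are not the mouse data reported here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each animal
    events : 1 if the animal died at that time, 0 if it was censored
    Returns a list of (time, survival probability) at each time with a death.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            # Multiply by the fraction of at-risk animals surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data: five animals, two of them censored.
example_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored animals leave the at-risk pool without lowering the curve, which is why the estimator is preferred over a simple fraction of survivors when follow-up is incomplete.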

The marked therapeutic effect of I-BET on endotoxic shock and 
sepsis occurred despite unaltered serum TNF levels (Fig. 4b). As TNF 
is an established mediator of sepsis-associated inflammatory pro- 
cesses, the protective effect of I-BET on sepsis suggests the ability of 
I-BET to interfere not only with the expression of inflammatory 
proteins (Fig. 4b), but also with TNF-inducible gene expression. 
Indeed, treatment of BMDMs with I-BET suppressed TNF-inducible 
key pro-inflammatory cytokine (Il1b, Il1a) and chemokine genes 
(Ccl5, Cxcl10, Cxcl2/3) as well as vasoactive and lipid-related genes 
(Pdgfb, Adora2b, Fabp3) that contribute to sepsis pathogenesis (Sup- 
plementary Fig. 14a, b). Notably, similar to the sI-BET genes in LPS- 
treated BMDMs, the majority of sI-BET genes in TNF-treated cells fit 
into the secondary response gene category as assessed by epigenetic 
modifications and CpG content (Supplementary Fig. 14c). 

In summary, we show the anti-inflammatory potential of the 
synthetic compound I-BET that, by interfering with binding of 
bromodomain-containing BET proteins to acetylated histones, dis- 
rupts the formation of the chromatin complexes essential for expres- 
sion of inflammatory genes. The genes susceptible to I-BET share a 
common pattern of chromatin modifications at their promoters as well 
as low promoter CpG content. Suppression of inflammation by I-BET 
demonstrates the potential of drugs that interfere with protein binding 
to post-translationally modified histones to achieve a high level of 
selectivity and potency by exploiting the inherited epigenetic states 
of genes that contribute to specific physiological and pathological 
processes. 


METHODS SUMMARY 


I-BET is an optimized derivative of benzodiazepine compounds that were iden- 
tified by high-throughput screening of activators of ApoA1-luciferase reporter in 
HepG2? cells as described in Supplementary Information. The chemical synthesis 
of I-BET is described in Supplementary Information. The 1.6 Å crystal structure of 
BRD4-BD1 with I-BET was produced by soaking apo crystals in 2 mM I-BET for 
4 days. Molecular replacement using 2OSS.pdb gave excellent difference density at 
the acetylated binding site that allowed the ligand binding to be unambiguously 
modelled. Methods and statistics for data collection and refined coordinates are 
provided in Supplementary Information and deposited in the RCSB Protein Data 
Bank with PDB ID code 3P5O. Bone marrow-derived macrophages (BMDMs) 
were differentiated from a bone marrow cell suspension obtained from C57BL/6 
mice as described in Supplementary Information. For microarray, qPCR (quanti- 
tative PCR) and ChIP (chromatin immunoprecipitation) analyses, BMDMs were 
pre-incubated with 1 µM or 5 µM of I-BET, DMSO or an inactive I-BET com- 
pound for 30 min before LPS (100 ng ml⁻¹) or TNF (50 ng ml⁻¹) stimulation. 
Microarray experiments were performed using Illumina MouseRef-8 v2.0 expression 
BeadChip kits (GEO accession code GSE21764). qPCR was performed using 
SYBR Green (Roche Lightcycler 480). ChIP was performed as described’’ and 
detailed in Supplementary Information. ChIP sequencing libraries were generated 
as described*® (GEO accession code GSE21910). For LPS-induced endotoxic 
shock, 5 mg per kg of LPS was injected intraperitoneally (i.p.) into age-matched 
C57BL/6 mice. Heat-killed Salmonella typhimurium (IR715; 5 X 10° per kg) was 
injected intravenously. Caecal ligation puncture (CLP) was performed as 
described”. For in vivo experiments I-BET or a solvent control (20% beta-cyclo- 
dextrin, 2% DMSO in 0.9% saline) were given via retro-orbital or tail vein injection 
(CLP) at a dose of 30 mg per kg. 
Covalent modification of histones is fundamental in orchestrating 
chromatin dynamics and transcription’ *. One example of such an 
epigenetic mark is the mono-ubiquitination of histones, which 
mainly occurs at histone H2A and H2B*°. Ubiquitination of 
histone H2A has been implicated in polycomb-mediated transcrip- 
tional silencing’ °. However, the precise role of the ubiquitin mark 
during silencing is still elusive. Here we show in human cell lines 
that ZRF1 (zuotin-related factor 1) is specifically recruited to 
histone H2A when it is ubiquitinated at Lys 119 by means of a 
novel ubiquitin-interacting domain that is located in the evolutio- 
narily conserved zuotin domain. At the onset of differentiation, 
ZRF1 specifically displaces polycomb-repressive complex 1 (PRC1) 
from chromatin and facilitates transcriptional activation. A 
genome-wide mapping of ZRF1, RING1B and H2A-ubiquitin 
targets revealed its involvement in the regulation of a large set of 
polycomb target genes, emphasizing the key role ZRF1 has in cell 
fate decisions. We provide here a model of the molecular mech- 
anism of switching polycomb-repressed genes to an active state. 
To identify proteins capable of binding ubiquitinated H2A (H2Aub), 
we developed an affinity purification based on the expression of Flag- 
tagged histone H2A. Among several potential H2Aub-binding proteins 
(Supplementary Fig. 1A, C and Supplementary Table 1), we chose to 
analyse ZRF1 in more depth, as within its carboxy terminus this protein 
contains two SANT domains, which are often found in subunits of 
chromatin-remodelling complexes (Fig. 1a). Intriguingly, its yeast 
homologue Zuo1 is linked to the ubiquitination of histone H2B in 
Saccharomyces cerevisiae’®. Moreover, ZRF1 has also been implicated 
in cancer and differentiation’. It adopts an oligomeric conformation 
and is located in the nucleus as well as in the cytosol (Supplementary 
Figs 1D, E and 2A). Purification of mononucleosomes from 293T cells 
expressing Flag-tagged histone H2A, either wild type or mutated 
(KKRR) at the ubiquitination sites (K118/K119), revealed ubiquitin- 
specific ZRF1 binding preferentially to the wild-type mononucleo- 
somes (Fig. 1b and Supplementary Fig. 1B, F, H, I). Corroborating this 
finding, we observed specific binding of ubiquitinated wild-type 
nucleosomes to recombinant ZRF1 (Fig. 1c). Thus, these data point 
to the ubiquitin mark at histone H2A as a docking site for ZRF1. 
ZRF1 shares homology in the zuotin domain with its yeast orthologue 
Zuo1 (Fig. 1a), which is synthetically lethal with Rad6, the E2 enzyme 
involved in the specific ubiquitination of histone H2B'°. We reasoned 
that the conserved zuotin domain might contain the ubiquitin-binding 
motif'*. Results from pull-down experiments with a GST-ubiquitin 
fusion protein and different recombinant ZRF1 truncation proteins 
allowed us to map the ubiquitin-binding domain (UBD) to a region 
C-terminal of the DnaJ domain (Fig. 1d). H2A ubiquitination as well as 
histone H3K27me3 are marks typically located in promoter regions of 
polycomb-silenced genes’*"*. To test for ubiquitin-dependent recruit- 
ment of ZRF1 to chromatin, we established NT2 knockdown cell lines 
for ZRF1 or RING1B (a PRC1 subunit that is an E3 ligase; Fig. 1e). We 


then analysed occupancy at several promoter regions of polycomb- 
repressed genes, including PER1, NF1C (Fig. 1f) and the well-characterized 
HOX genes'*®. ZRF1 enrichment at the promoters clearly depended 
on the abundance of RING1B and on H2Aub levels (Fig. 1g, h and 
Supplementary Fig. 1G). 

It has been shown that PRC1 is tethered to chromatin by the inter- 
action of its subunit PC1 with a trimethyl mark on Lys 27 of histone H3 
(H3K27me3)*"*. Using purified mononucleosomes containing either 
wild-type H2A or the H2A(KKRR) mutant, we observed that co- 
purification of the PRC1 subunits RING1B and BMI1 depended on 
the ubiquitination of histone H2A (Fig. 2a). In contrast, we did not find 
an alteration of the H3K27 methylation levels in nucleosomes devoid 
of the ubiquitin mark, indicating that stable maintenance of PRC1 at 
chromatin depends on the ubiquitin mark (Fig. 2a and Supplementary 
Fig. 2J). To understand the functional relationship between ZRF1 and 
PRC1, we characterized further the binding affinity of RING1B 
towards the ubiquitin residue by GST pull-down experiments (Sup- 
plementary Fig. 2B). Furthermore, after reconstituting RING1B- 
containing mononucleosome complexes, RING1B was efficiently 
released from nucleosomes following incubation with GST-ubiquitin 
(Fig. 2b and Supplementary Fig. 2C). This finding indicated that ZRF1 
could compete with RING1B for binding at H2Aub. Indeed, ZRF1 
overexpression led to displacement of the PRC1 subunits RING1B 
and BMI1 from chromatin, whereas ZRF1 knockdown led to an 
enhanced occupancy of RING1B at chromatin that caused an increase 
in H2A ubiquitination (Fig. 2c, d and Supplementary Fig. 2D-H). We 
next performed competition assays with the GST-ubiquitin substrate. 
When the His-tagged RING1B concentration was maintained, we 
observed that increasing the His-ZRF1 concentration led to a reduc- 
tion of RING1B bound to the ubiquitin substrate, emphasizing the 
competition for the ubiquitin residue by both proteins (Fig. 2e). We 
then assembled recombinant RING1B-GST-ubiquitin complexes and 
performed pull-down experiments after adding either bovine serum 
albumin (BSA) alone (lane 1) or recombinant His-UBD and BSA 
(lanes 2 and 3). In concordance with the previous result, we observed 
RING1B replaced by the UBD of ZRF1 (Fig. 2f). Similarly, on recon- 
stituted RING1B-mononucleosome complexes, ZRF1 efficiently dis- 
placed RING1B (Fig. 2g and Supplementary Fig. 2I). Finally, chromatin 
immunoprecipitation (ChIP) experiments in 293T cells overexpressing 
either ZRF1 or only the UBD, indicated an enrichment of ZRF1 or the 
UBD at promoters of the HOX gene cluster concomitantly with the 
displacement of the PRC1 subunits RING1B and BMI1 (Fig. 2h, i and 
Supplementary Fig. 2F). In contrast, neither a ZRF1 deletion mutant 
devoid of the UBD nor the yeast homologue Zuo1, which shows only a 
weak ubiquitin-binding capacity, were recruited to chromatin or were 
able to displace PRC1 (Supplementary Fig. 3A-C). It has been shown 
that depletion of RING1B, and thus H2A ubiquitination, leads to the 
loss of PRC2 from chromatin’’. In agreement with this previous study, 
we found that PRC2 levels were reduced at KKRR mutant nucleosomes. 
Figure 1 | ZRF1 interacts with H2Aub. a, Schematic diagram of ZRF1 
orthologues indicating the DnaJ domain and SANT domains. The numbers 
along the right-hand side of panels a and d refer to the number of amino acids 
each of the proteins is composed of. b, Flag-tagged histone H2A and 
H2A(KKRR) were expressed in 293T cells. Mononucleosomes were purified 
and eluates were subjected to immunoblot analysis using ZRF1 and Flag 
antibodies. The inputs correspond to 3%. c, Nuclear protein extracts containing 
mononucleosomes were incubated with recombinant His-ZRF1. Precipitated 
ZRF1-nucleosome complexes were subjected to immunoblot analysis using the 
indicated antibodies. The inputs represent 5% of His-ZRF1 and 2% of the 


Similarly, PRC2 levels decreased upon binding of ZRF1 to chromatin 
(Supplementary Fig. 4A–C). 

To globally identify ZRF1 target genes, we performed a ChIP-on- 
chip (see Methods) analysis in NT2 cells'*'’. Because our data indicate 
that ZRF1 might antagonize silencing by polycomb proteins, we 
designed an experiment that allowed us to study the occupancy of 
ZRF1 under conditions of retinoic-acid-induced differentiation 
(Fig. 3a). We found ZRF1 to be enriched in 758 (not induced), 2,295 
(induced for 1h) or 995 (induced for 3h) genes (Fig. 3b and 
Supplementary Table 2). Analysis of the ZRF1 occupancy at its target 
genes revealed a marked increase at 1h of induction (Fig. 3b, 
Supplementary Fig. 5C and Supplementary Table 2). Clustering the 
target genes with respect to their cellular functions indicates a role for 
ZRF1 in developmental processes and differentiation (Fig. 3c, d and 
Supplementary Fig. 5A, B). Additional ChIP-on-chip analysis indi- 
cates that RING1B and H2Aub target genes are mainly involved in 
developmental processes (Supplementary Figs 7A-L, 8A-J and 
Supplementary Tables 4 and 5), as shown in previous publications’. 
The overlap of ZRF1 targets (1 h retinoic acid) with RING1B and 
H2Aub targets led to the identification of 1,102 common target genes 
(Fig. 3e, f). Moreover, comparison of ZRF1 target genes with polycomb 
target genes” indicates that ZRF1 is more closely linked to PRC1 than 
to PRC2 (Supplementary Figs 6A, B, 8K, L). We next performed a gene 
protein extract. d, GST pull-downs with GST, GST-ubiquitin (GST-ubi) and 
GST-RING1B (right panel) and the His-tagged proteins indicated. Bound 
material was subjected to immunoblot analysis using His and ZRF1 antibodies. 
The input shown represents 2%. WT, wild type. e, Protein extracts of RING1B 
and ZRF1 knockdown cell lines were subjected to immunoblotting and probed 
with the antibodies indicated in the figure. f, ChIP experiments performed in 
NT2 control cells with RING1B antibodies. g, h, ChIP experiments performed 
in the NT2 control and knockdown cells with ZRF1 and H2Aub antibodies. The 
occupancy at promoters of the PER1, NF1C and HOXA4 genes was tested by 
quantitative PCR. Data are represented as mean ± s.e.m. (n = 3). 


expression analysis comparing short hairpin RNA targeting ZRF1 
(shZRF1) with shControl (non-specific shRNA constructs) cells, with 
or without retinoic-acid treatment. Interestingly, downregulated genes 
in shZRF1 after retinoic-acid stimulation are ZRF1 or polycomb targets, 
particularly for PRC1 and H2Aub (Supplementary Figs 6C, 9A–G and 
Supplementary Table 6). Among these genes more than a hundred are 
targeted by ZRF1, RING1B and H2Aub and many of these are major 
players in developmental pathways (Fig. 4a, b). To corroborate our 
findings, we performed ChIP experiments and gene expression analysis 
on selected ZRF1 target genes. We found that ZRF1 was significantly 
enriched at these genes only after stimulation with retinoic acid (Fig. 4c 
and Supplementary Fig. 10A). Under the same conditions, we observed 
transcriptional activation of the same genes in wild-type NT2 cells. 
However, in ZRF1 knockdown cells, we detected a decrease of the 
messenger RNA levels (Fig. 4d and Supplementary Fig. 10B). In sum, 
the data presented show a clear involvement of ZRF1 in the PRC1 
pathway and, most importantly, that activation of genes targeted by 
PRC1 and H2Aub is facilitated by ZRF1. 

Several polycomb target genes become activated during differenti- 
ation, concomitantly with the disappearance of the polycomb-dependent 
repressive marks'*'**°**. Analysis of two HOXA genes revealed that 
retinoic-acid-induced transcriptional activation depended on the pres- 
ence of ZRF1. In contrast, RING1B knockdown caused a more robust 
Figure 2 | ZRF1 and PRC1 compete for binding of H2Aub. 

a, Mononucleosomes were purified from 293T cells expressing Flag-tagged 
H2A or a double mutant (KKRR). The purified material was subjected to 
immunoblot analysis using the indicated antibodies. The inputs represent 3%. 
b, Nucleosome-His—RING1B complexes were assembled, washed and 
incubated with GST (70 ng µl⁻¹) or GST-ubiquitin (70 ng µl⁻¹). Flag eluates 
were subjected to immunoblot analysis with the indicated antibodies. 

c, Chromatin association assay of 293T cells overexpressing ZRF1. Immunoblot 
analysis was performed with the indicated antibodies. d, Immunoblot analysis 
of 293T cells overexpressing ZRF1 using the indicated antibodies. e, GST- 
ubiquitin was incubated with constant amounts of His-RING1B and increasing 
amounts of His—ZRF1 finally reaching equimolar levels (last lane). The inputs 
show 10% of His-RING1B and 10% of the maximal amount of His—ZRF1. 


activation of those genes, thus supporting opposing roles for PRC1 and 
ZRF1 in transcriptional regulation of promoters (Fig. 4e). Next we inves- 
tigated the occupancy of both ZRF1 and RING1B at promoters of HOX 
genes during retinoic-acid-induced transcription. Retinoic-acid treat- 
ment led to the recruitment of ZRF1 to promoter regions with a con- 
comitant reduction of RING1B occupancy, clearly indicating mutually 
exclusive binding for these proteins at chromatin (Fig. 4f, g). 
Accordingly, in ZRF1 knockdown cells, RING1B was not efficiently 
removed from chromatin after retinoic-acid induction (Fig. 4h), as sup- 
ported by previous experiments (Fig. 2a—h). In related experiments (1h 
retinoic acid) we found H2Aub to be slightly reduced at HOXA gene 
promoters, indicating a deletion of this histone mark shortly after the 
removal of PRC1 complexes (Supplementary Fig. 11A-C). A set of 
similar results was obtained in retinoic-acid-induced differentiation 
of leukaemic cells (Supplementary Fig. 10C-E)**. On the basis of our 
results, we reasoned that ZRF1 might facilitate transcription. Recently, 
it has been shown that USP21-mediated H2A deubiquitination pre- 
cedes gene activation’’. To investigate further the impact of ZRF1 on 
transcriptional activation, we performed in vitro experiments testing 
whether ZRF1 might act in concert with specific deubiquitinases. In 
vitro deubiquitination assays carried out with mouse liver chromatin 
demonstrate that ZRF1 facilitates H2A deubiquitination (Fig. 4i). 
f, GST and GST-ubiquitin were incubated with RING1B, washed and 
incubated with His-ZRF1-UBD (see Methods). The retained material was 
subjected to immunoblot analysis with His antibodies. Lane 1 shows the pull- 
down in the presence of only BSA, lanes 2 and 3 in the presence of both BSA and 
His-ZRF1-UBD. g, Nucleosome-His-RING1B complexes were assembled and 
incubated with GST (100 ng µl⁻¹) or ZRF1 (100 ng µl⁻¹). After elution by Flag 
peptide, immunoblot analysis was performed with Flag, RING1B and ZRF1 
antibodies. h, ChIP experiments with ZRF1, RING1B and BMI1 antibodies 
after overexpression of ZRF1 in 293T cells. i, Experiments were performed as 
already stated with the exception that the Flag-UBD was overexpressed instead 
of the full-length ZRF1. The occupancy at promoters of HOX genes was tested 
with quantitative PCR. Data are represented as mean ± s.e.m. (n = 3). 


Thus, these results showed that, besides its function in the displace- 
ment of PRC1 complexes, ZRF1 facilitates transcription by cooperat- 
ing with deubiquitinase enzymes. 

Ubiquitination of H2A has long been correlated with activation of 
genes”®. It is intriguing that ubiquitination of histone H2A not only has 
an effect on gene silencing but also is necessary to attract a factor that 
switches genes from a silenced to a transcriptionally activated state. 
However, it is still unclear how ZRF1 binding to chromatin is regulated 
(Supplementary Fig. 12A, B). One potential mode of regulating ZRF1, 
and thus cell differentiation, could be to mask its UBD domain. It has 
been shown that proteins of the ID (inhibitor of differentiation) family 
bind to ZRF1 in a region spanning its UBD domain’’ (Supplementary 
Fig. 12C). Our data indicate that association of PRC1 with chromatin 
depends on the H2Aub mark, whereas H3K27me3 is not sufficient to 
retain PRC1 complexes and is most probably required for its initial 
targeting’’*’. RING1B/PRC1 are not as abundant as H2Aub, thus 
excluding a continuous binding of PRC1 complexes throughout chro- 
matin. Yet it has been shown that during DNA damage H2A E3 ligases 
bind ubiquitinated H2A and propagate the initial chromatin ubiqui- 
tination marks”. A similar sliding mechanism could also apply to our 
findings regarding RING1B, and challenge the current view of ubiqui- 
tination and deubiquitination cycles (see also Supplementary 
Figure 3 | Genome-wide mapping of ZRF1 target genes in NT2 cells. 

a, Schematic representation of the experimental approach for the ChIP-on-chip 
experiment. Chromatin was subjected to triplicate ChIP experiments with 
ZRF1 and control antibodies. The obtained material was amplified and 
hybridized with Human Promoter Arrays chips from Agilent. b, Venn diagram 
of the ZRF1 target genes as obtained by Chipper analysis. c, Functional 
enrichment analysis of ZRF1 target genes at the different time points of 
retinoic-acid (RA) induction. d, A selection of ZRF1 target genes identified in 
this study (induced for 1 h), focusing on those known to be involved in key 
pathways controlling cell fate decisions. e, Venn diagram showing significant 
overlapping between the gene lists of RING1B, H2Aub and ZRF1 (induced for 
1 h) as obtained by ChIP-on-chip analysis. The P values after overlapping the 
H2Aub target genes with ZRF1 and/or RING1B targets are listed in the 
following: RING1B (P= 10° '°), ZRF1 (1h; P= 10 '?) and RING1B-ZRF1 co- 
targets (P= 10 7°). f, Functional enrichment analysis of the 1,102 common 
ZRF1/RING1B/H2Aub target genes. 
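
The caption does not say which test produced the overlap P values. One common choice for asking whether two gene lists share more members than expected by chance is a one-sided hypergeometric test, sketched below as an assumption rather than as the authors' method. The function name and the example numbers are hypothetical.

```python
from math import comb


def overlap_p_value(universe, n_a, n_b, overlap):
    """One-sided hypergeometric P value for the overlap of two gene lists.

    universe : number of genes that could have been called (e.g. all array genes)
    n_a, n_b : sizes of the two target lists
    overlap  : number of genes found in both lists
    Returns P(X >= overlap) when n_b genes are drawn at random from the universe.
    """
    total = comb(universe, n_b)
    tail = sum(
        comb(n_a, k) * comb(universe - n_a, n_b - k)
        for k in range(overlap, min(n_a, n_b) + 1)
    )
    return tail / total


# Hypothetical toy numbers: two lists of 5 genes from a universe of 10, sharing all 5.
p_toy = overlap_p_value(universe=10, n_a=5, n_b=5, overlap=5)
```

The result depends strongly on the chosen universe size, so any real use would need the number of genes represented on the promoter array.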
Figure 4 | ZRF1 functions in activating polycomb-repressed genes. a, The 
list of genes significantly repressed in comparison to shControl cells after 
stimulation with retinoic acid was overlapped with the common ZRF1/ 
RING1B/H2Aub target genes (see also Supplementary Fig. 9). b, Functional 
enrichment analysis of the 104 common target genes downregulated in shZRF1 
cells. c, ChIP experiments were performed with ZRF1 antibodies and 
chromatin obtained from NT2 induced with retinoic acid (THRA, FGF9, RARB 
and RARA: 1 h retinoic acid; PER1 and USP20: 3 h retinoic acid). The 
occupancy at promoters of the aforementioned genes was tested by quantitative 
PCR. Data are represented as mean ± s.e.m. (n = 3). d, The mRNA levels of the 
genes indicated were measured in NT2 shZRF1 and shControl cell lines after 


supplementing with retinoic acid for the respective times (THRA, FGF9, PER1 
and USP20: 3 h retinoic acid; RARA and RARB: 2 h retinoic acid). Data are 
represented as mean ± s.e.m. (n = 3). e, shControl, shZRF1 and shRING1B 
NT2 cells were induced for 1h with 10 °M of retinoic acid. RNA levels of the 
HOXA1 and HOXA2 mRNA were measured in relation to mRNA levels of the 
ribosomal gene PUM1 (n = 3). f-h, shControl NT2 cells or shZRF1 
knockdown cells were kept under the same conditions as in e, and chromatin 
was used in ChIP experiments with RING1B and ZRF1 antibodies. Data are 
represented as mean ± s.e.m. (n = 3). i, Mouse liver chromatin was incubated 
with ZRF1 and USP21 (10 or 20 ng) as indicated. The H2Aub levels were 
quantified after detection with specific antibodies. 
Discussion). However, future research will have to reveal the dynamics 
of PRC1-catalysed ubiquitination. 


METHODS SUMMARY 


Experiments were performed using human cell lines (NT2, 293T and U937) and 
affinity-purified or commercially available antibodies. The knockdown cells used 
were established by retroviral infection. ChIP experiments, mutagenesis of histone 
H2A, genome-wide studies and protein purification procedures are explained in 
Methods. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 


Received 12 June 2009; accepted 12 October 2010. 


1. Strahl, B. D. & Allis, C. D. The language of covalent histone modifications. Nature 
403, 41-45 (2000). 

2. Li, B., Carey, M. & Workman, J. L. The role of chromatin during transcription. Cell 
128, 707-719 (2007). 

3. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693-705 
(2007). 

4. Goldknopf, I. L. et al. Isolation and characterization of protein A24, a “histone-like” 
non-histone chromosomal protein. J. Biol. Chem. 250, 7182-7187 (1975). 

5. Zhu, B. et al. Monoubiquitination of human histone H2B: the factors involved and 
their roles in HOX gene regulation. Mol. Cell 20, 601-611 (2005). 

6. Nickel, B. E., Allis, C. D. & Davie, J. R. Ubiquitinated histone H2B is preferentially 
located in transcriptionally active chromatin. Biochemistry 28, 958-963 (1989). 

7. Wang, H. et al. Role of histone H2A ubiquitination in Polycomb silencing. Nature 
431, 873-878 (2004). 

8. Kuzmichev, A., Nishioka, K., Erdjument-Bromage, H., Tempst, P. & Reinberg, D. 
Histone methyltransferase activity associated with a human multiprotein complex 
containing the Enhancer of Zeste protein. Genes Dev. 16, 2893-2905 (2002). 

9. Zhou, W. et al. Histone H2A monoubiquitination represses transcription by 
inhibiting RNA polymerase II transcriptional elongation. Mol. Cell 29, 69-80 
(2008). 

10. Pan, X. et al. A DNA integrity network in the yeast Saccharomyces cerevisiae. Cell 
124, 1069-1081 (2006). 

11. Greiner, J. et al. Characterization of several leukemia-associated antigens inducing 
humoral immune responses in acute and chronic myeloid leukemia. Int. J. Cancer 
106, 224-231 (2003). 

12. Resto, V. A. et al. A putative oncogenic role for MPP11 in head and neck squamous 
cell cancer. Cancer Res. 60, 5529-5535 (2000). 

13. Inoue, T., Shoji, W. & Obinata, M. MIDA1, an Id-associating protein, has two distinct 
DNA binding activities that are converted by the association with Id1: a novel 
function of Id protein. Biochem. Biophys. Res. Commun. 266, 147-151 (1999). 

14. Hicke, L., Schubert, H. L. & Hill, C. P. Ubiquitin-binding domains. Nature Rev. Mol. 
Cell Biol. 6, 610-621 (2005). 

15. Bracken, A. P., Dietrich, N., Pasini, D., Hansen, K. H. & Helin, K. Genome-wide 
mapping of Polycomb target genes unravels their roles in cell fate transitions. 
Genes Dev. 20, 1123-1136 (2006). 

16. Lee, M.G.eta/l. Demethylation of H3K27 regulates polycomb recruitment and H2A 
ubiquitination. Science 318, 447-450 (2007). 


1128 | NATURE | VOL 468 | 23/30 DECEMBER 2010 


17. Saito, S. et al. Haptoglobin-β chain defined by monoclonal antibody RM2 as a novel
serum marker for prostate cancer. Int. J. Cancer 123, 633-640 (2008).

18. Andrews, P. W. Retinoic acid induces neuronal differentiation of a cloned human
embryonal carcinoma cell line in vitro. Dev. Biol. 103, 285-293 (1984).

19. Andrews, P. W. et al. Pluripotent embryonal carcinoma clones derived from the
human teratocarcinoma cell line Tera-2. Differentiation in vivo and in vitro. Lab.
Invest. 50, 147-162 (1984).

20. Boyer, L. A. et al. Polycomb complexes repress developmental regulators in murine
embryonic stem cells. Nature 441, 349-353 (2006).

21. Kallin, E. M. et al. Genome-wide uH2A localization analysis highlights Bmi1-
dependent deposition of the mark at repressed genes. PLoS Genet. 5, e1000506
(2009).

22. O’Geen, H. et al. Genome-wide analysis of KAP1 binding suggests autoregulation of
KRAB-ZNFs. PLoS Genet. 3, e89 (2007).

23. Pasini, D., Bracken, A. P., Hansen, J. B., Capillo, M. & Helin, K. The polycomb group
protein Suz12 is required for embryonic stem cell differentiation. Mol. Cell. Biol. 27,
3769-3779 (2007).

24. Villa, R. et al. Role of the polycomb repressive complex 2 in acute promyelocytic
leukemia. Cancer Cell 11, 513-525 (2007).

25. Nakagawa, T. et al. Deubiquitylation of histone H2A activates transcriptional
initiation via trans-histone cross-talk with H3K4 di- and trimethylation. Genes Dev.
22, 37-49 (2008).

26. Levinger, L. & Varshavsky, A. Selective arrangement of ubiquitinated and D1
protein-containing nucleosomes within the Drosophila genome. Cell 28, 375-385
(1982).

27. Lagarou, A. et al. dKDM2 couples histone H2A ubiquitylation to histone H3
demethylation during Polycomb group silencing. Genes Dev. 22, 2799-2810
(2008).

28. Doil, C. et al. RNF168 binds and amplifies ubiquitin conjugates on damaged
chromosomes to allow accumulation of repair proteins. Cell 136, 435-446
(2009).


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We are indebted to S. Jentsch, S. Berger, K. Helin, J. Hasskarl,
R. Shiekattar, T. Zimmermann and V. Raker for antibodies and plasmids and for
discussions; and to the CRG Microarray facility and Light Microscopy Facility. This work
was supported by the Spanish “Ministerio de Educación y Ciencia” (BFU2007-63059),
the Association for International Cancer Research (10-0177), by the AGAUR and
Consolider to L.D.C., and by FOR967 to S.R.; H.R. was supported by a FEBS fellowship;
J.R. by a fellowship from Fundação para a Ciência e Tecnologia; L.R.-V. by a Juan de la
Cierva Fellowship; S.D. by a PFIS fellowship.


Author Contributions H.R. cloned, purified proteins and performed biochemical 
studies. H.R., L.R.-V., J.D.R. and S.D. performed ChIP analysis. G.G. and N.L.-B.
performed genome-wide analysis. T.N. and T.I. performed in vitro transcription and
deubiquitination experiments. S.R. provided essential tools. H.R. and L.D.C. designed 
the experiments, supervised the project and wrote the manuscript. All authors 
commented on the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to L.D.C. (luciano.dicroce@crg.es). 


©2010 Macmillan Publishers Limited. All rights reserved 


METHODS 

Plasmids, antibodies and cell lines. Antibodies against ZRF1 and RING1B were 
either previously described”, or raised in rabbits against full-length protein and 
affinity purified. To that end, GST fusion proteins of both proteins were cross- 
linked to glutathione beads and packed into polystyrol mini-columns (Pierce). 
Antisera were repeatedly run over the columns, washed and finally eluted in Tris 
buffer pH 2.5. The affinity-purified antibody was finally set to pH 8.0. For Fig. 1d 
the ZRF1 serum against full-length protein was used to visualize the recombinant 
protein deletion mutants. In all other experiments the antibody purified with 
GST-ZRF1ΔSANT (a ZRF1 protein lacking the C-terminal SANT domains)
was used. Antibodies against H2Aub, IgM conjugating antibody and H3K4 
trimethyl were obtained from Upstate antibodies. Antibodies against histone 
H2A and the histone modification H3K4 trimethyl were purchased from Abcam. 
Antibodies against the His and Flag epitopes were purchased from Qiagen and 
SIGMA, respectively. Antibodies against EED and SUZ12 were a gift from K. Helin. 
Plasmids for the ectopic expression of Flag-tagged ID proteins were a gift from J. 
Hasskarl. For tagging proteins the pet28 (His tag, Novagen), pCMV2 (Flag, 
Invitrogen) and pGex (GST, Invitrogen) vector series were used. The ZRF1 specific 
sequences GITATCTGATCCAGTGAAA and GATCAAAGCAGCTCATAAA 
were used to synthesize oligonucleotides and cloned into pRetroSuper*’. In the 
case of RING1B the specific sequences AGAACACCATGACTACAAA and
TTCTAAAGCTAACCTCACA were cloned into the same vector. Mutagenesis 
of histone H2A was performed using the Quikchange mutagenesis kit 
(Stratagene) on a pCMV2b histone H2A plasmid. Information on the cloning 
and sequences are available upon request. The embryonic carcinoma cell line 
NTERA2 (NT2/D1) and HEK 293T cells were cultured in DMEM medium sup- 
plemented with 10% fetal bovine serum at 37 °C and 5% CO₂. NT2 cells were
treated with retinoic acid to induce differentiation at the given concentrations for 
the mentioned time intervals. U937 cells were cultured in RPMI medium at 37 °C 
and 5% CO₂.

Purification of recombinant proteins. Proteins were purified as suggested by 
Qiagen (His-tagged proteins) and Amersham (GST-tagged proteins) after indu- 
cing BL21 bacterial strains transformed with the respective plasmids at an optical 
density of 0.5 with 0.2 mM of isopropyl-β-D-thiogalactoside either for 4 h at 37 °C
or for 14 h at 20 °C.

Purification of ubiquitin-binding proteins. HEK 293T cells were transfected 
with pCMV2b-histone H2A or the corresponding empty vector (Control) and 
after 48 h mononucleosomes were purified by means of the Flag epitope as stated 
in Supplementary Fig. 1A, C. After harvesting by centrifugation, cells were resus- 
pended in buffer A (10 mM HEPES pH 7.9, 1.5 mM MgCl₂, 10 mM KCl and
0.5 mM dithiothreitol (DTT), phenylmethylsulphonyl fluoride (PMSF)) and
homogenized by 10 strokes in a Dounce homogenizer with a B-type pestle. 
After centrifugation, nuclei were resuspended in lysis buffer (137 mM NaCl, 
2.7 mM KCl, 10 mM NaH₂PO₄, 2 mM KH₂PO₄, 0.1% Triton X-100, 0.5 mM
DTT, PMSF) and sonicated using a Diagenode Bioruptor to obtain mononucleosomes
(4 °C, 4 cycles of 15 min, ‘H’ setting). Protein extracts were then subjected to
centrifugation (16,100g, 4 °C, 30 min) to remove debris and incubated with M2- 
Flag Agarose beads. The bound material or the control beads (M2-beads incubated 
with protein extracts from control transfections) were poured in polystyrol mini- 
columns (Flag~H2A column and Control column), washed intensively with lysis 
buffer and then used subsequently in an affinity purification. To this end, a nuclear 
protein extract devoid of histone proteins was prepared from 293T cells as previ- 
ously described*". In brief, nuclei were extracted by resuspension of cells in buffer 
A (10 mM HEPES pH 7.9, 1.5 mM MgCl₂, 10 mM KCl and 0.5 mM DTT, PMSF)
and homogenized by 10 strokes in a Dounce homogenizer with a B-type pestle. 
The crude nuclei were resuspended in buffer C (20 mM HEPES pH 7.9, 25% (v/v) 
glycerol, 1.5 mM MgCl₂, 420 mM NaCl, 0.2 mM EDTA, 0.5 mM DTT and PMSF)
and homogenized in a Dounce homogenizer (10 strokes, B-type pestle). The 
resulting protein suspension was stirred by a magnetic stirring bar for 30 min at 
4°C and then centrifuged at 25,000g in an SS34 rotor for 3h. The resulting 
supernatant was dialysed against lysis buffer, and run in a loop over two polystyrol 
mini-columns (Flag—H2A column and Control column; see above). After intensive 
washing with lysis buffer the columns were incubated with a solution of lysis buffer 
with recombinant His-tagged ubiquitin previously purified by Ni-NTA Agarose 
(Qiagen) and gel filtration on a Superose 12 column. After eluting the ubiquitin- 
binding proteins, the columns were washed again in lysis buffer and mononucleo- 
somes were subsequently eluted by a solution of Flag peptide in lysis buffer. Both 
eluates were subjected to electrophoresis, stained with colloidal coomassie, and 
possible interactors were subjected to MALDI-Fingerprint analysis.

Transfection and retroviral infection. Transfection of HEK 293T cells was usually
performed by the calcium phosphate co-precipitation method as described”. pRS- 
based retrovirus was produced by transfecting the GP2-293 packaging cell line 
(Clontech). The collected retrovirus was subsequently used to transduce NT2 or
293T cell lines by spinoculation at 900g for 90 min at 32 °C in the presence of
protamine sulphate. After incubating overnight at 37 °C the protocol was repeated 
for two consecutive days. 

M2-Flag affinity chromatography. Purification of Flag-tagged proteins from 
293T cells was essentially done as described earlier. All experiments with Flag- 
tagged histone H2A were performed in polystyrene mini-columns (Pierce) with 
subsequent elution using the Flag peptide (Sigma) at a concentration of 100 µg ml⁻¹
in PBS. 

ZRF1-H2Aub interaction experiments. Nuclear protein extracts were prepared 
as described earlier to obtain mononucleosomes. The protein extract was then 
incubated with or without recombinant His—ZRF1 (lanes ZRF1 and Control in 
Fig. 1c) for 4h at 4°C. Ni-NTA Agarose was added and after 2 h of incubation at 
4 °C the beads were washed intensively with lysis buffer. The precipitated material 
was then subjected to western blotting. 

Nucleosome-RING1B complexes and in vitro assays. Mononucleosomes were
purified as described earlier, but washed with lysis buffer containing 450 mM
NaCl and kept bound to the Flag-M2 Agarose beads. The bound nucleosomes
and empty M2 beads were subsequently incubated with bacterial extracts in lysis
buffer containing recombinant His-RING1B. After 2 h of incubation at 4 °C the
beads were washed intensively in the same buffer (see Supplementary Fig. 2C). The
RING1B-nucleosome complexes were then incubated with equal or equimolar
amounts of either GST or GST-ubiquitin (Fig. 2b) or ZRF1 (Fig. 2g) in lysis buffer.
After 2 h of incubation at 4 °C the beads were packed into polystyrene columns,
washed and eluted with Flag peptide at 100 µg ml⁻¹. The eluate was finally
subjected to immunoblotting.

ChIP. ChIP experiments were essentially performed as described”. For all experi- 
ments affinity-purified antibodies were used as described earlier. The immuno- 
precipitated DNA was quantified by real-time quantitative PCR (Roche 
Lightcycler). The primers for verifying the occupancy of the immunoprecipitated 
protein at chromatin are available upon request. 

Genome-wide mapping of ZRF1 target genes (ChIP-on-chip). Chromatin from 
NT2 cells before (0 h) and after induction with retinoic acid (10 °M) for 1 h or 3 h
was subjected to ChIP experiments with ZRF1 and control antibodies. For each 
time-point of the ChIP experiments triplicates were carried out. The obtained 
material was amplified with the WGA kit (Sigma) and linear amplification of 
the material was tested in qPCR reactions with known ZRF1 targets. Labelling 
and hybridization to Agilent Human Promoter Arrays were carried out following 
the supplier’s instructions. Analogously, chromatin from unstimulated NT2 cells 
was subjected to ChIP experiments with RING1B, H2Aub and the respective 
conjugating antibody. The obtained material was processed as described earlier.

Microarray analysis. Microarray analysis was performed after extracting a triplicate
of three different biological samples of RNA from NT2 cell lines (shZRF1
and shControl) either from non-induced cells or cells induced with retinoic acid 
(10°-°M, 3h). RNA was amplified, labelled and subsequently hybridized to a 
Human Genome Oligo Microarray (Agilent). Raw data were analysed using the 
Limma package. 

Data analysis and statistics. Absolute foreground and background readings from 
channels were used as input to the chipper program. Default parameters were used 
as defined previously’. Chipper calculates q values (corrected P values), thus 
accounting for multiple testing corrections per probe. Probes with q values 
<0.05 were accepted as significant. Probes, which are significantly bound by 
ZRF1, were compared to those significantly bound by IgG to subtract IgG targets. 
ZRF1 targets were mapped to genes according to the information provided by 
Agilent. To study significant overlapping between genes bound by ZRF1 and genes 
bound by SUZ12, RING1B, H2Aub or the H3K27me3 mark, respectively, the 
enrichment analysis (EA) method was applied. The statistical significance (P 
value) was calculated using the binomial distribution. Significance levels were 
corrected for multiple comparisons with the Benjamini and Hochberg method.
Functional enrichment analysis was performed with the DAVID software’’. 
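
The overlap statistics described above can be illustrated with a minimal sketch. It assumes a one-sided binomial tail for the enrichment P value and the standard Benjamini and Hochberg step-up adjustment; the function names and all numbers are hypothetical and are not taken from the paper, whose exact implementation is not given here.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k overlaps."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 30 of 100 ZRF1 targets also carry a mark that
# covers 10% of all genes on the array.
p_overlap = binomial_tail(k=30, n=100, p=0.1)
q_values = benjamini_hochberg([0.001, 0.02, 0.03, 0.5])
```

With several overlap tests (for example against SUZ12, RING1B, H2Aub and H3K27me3), each raw P value would be passed through `benjamini_hochberg` together with the others before applying a significance threshold.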

RNA preparation and analysis by quantitative PCR. RNA was extracted with
the RNeasy mini kit (Qiagen) and transcribed to cDNA by reverse transcription 
using the AMV kit (Roche). The expression of the respective genes was assayed by 
quantitative real-time PCR (Roche Lightcycler). As a reference, the expression of 
GAPDH or PUM1 was measured for each experiment. The sequences of the
primers are available upon request. 

GST pull-down. Purified GST-proteins were bound in equimolar amounts to 
glutathione beads (Amersham) in binding buffer (20 mM Tris pH 8.0, 150 mM 
NaCl, 0.5% NP-40). Loaded beads were washed in the same buffer and used for 
incubation with recombinant proteins for 2h at 4°C. For the competition assay 
(Fig. 2e) with recombinant ZRF1 and RING1B, the amounts of RING1B were kept 
constant and the amounts of ZRF1 were increased with every consecutive pull- 
down until finally reaching equimolar conditions. For preassembling RING1B- 
ubiquitin complexes (Fig. 2f), GST and GST-ubiquitin were bound to beads,
washed and incubated with RING1B at 4 °C for 2 h. Loaded beads were then
incubated with a roughly tenfold higher amount of ZRF1-UBD together with an
excess of BSA (where stated) for 90 min at 4 °C. Finally, beads were washed
intensively in binding buffer, denatured in SDS buffer, and subjected to electro- 
phoresis and subsequent western blotting analysis. 

Gel-filtration analysis. Gel filtration was performed on an ÄKTA Explorer
system (Amersham) using Superose 12 or Superose 6 columns (Amersham).
After calibrating the column with specific proteins, a solution of recombinant 
protein in PBS was injected and the UV-elution profile was detected. To verify 
each volume of elution the fractions were subjected to western blotting by probing 
with specific antibodies. 

Chromatin association assays. Cells were crosslinked with a solution of 1% 
formaldehyde in PBS for 10 min at 24 °C. Nuclei were prepared by resuspending 
the cell pellets in buffer A (100 mM Tris pH 7.5, 5 mM MgCl₂, 60 mM KCl, 0.5 mM
DTT, 125 mM NaCl, 300 mM sucrose, 1% NP-40). After lysis on ice the nuclei
were pelleted and resuspended in buffer B (100 mM Tris pH 7.5, 1 mM CaCl₂,
60 mM KCl, 0.5 mM DTT, 125 mM NaCl, 300 mM sucrose) and supplemented
with 10 U of MNase I for 20 min at 37 °C. The reaction was stopped by adding
EDTA. The chromatin was pelleted and resuspended in buffer C (1% SDS, 10 mM
EDTA, 50 mM Tris pH 8.0) overnight at 4 °C. After centrifugation (16,100g,
2 min) the supernatant was used for western blotting.

In vitro deubiquitination assays. Deubiquitination experiments were essentially 
performed as previously described”*. In short, mouse liver chromatin was incu- 
bated with no or increasing amounts of recombinant ZRF1. Subsequently USP21 
was added and reactions were incubated at 37 °C for 18 min.


Neurotransmitter/sodium symporter orthologue
LeuT has a single high-affinity substrate site 


Chayne L. Piscitelli, Harini Krishnamurthy & Eric Gouaux


Neurotransmitter/sodium symporters (NSSs) couple the uptake 
of neurotransmitter with one or more sodium ions’ *, removing 
neurotransmitter from the synaptic cleft. NSSs are essential to the 
function of chemical synapses, are associated with multiple neuro- 
logical diseases and disorders’, and are the targets of therapeutic 
and illicit drugs’. LeuT, a prokaryotic orthologue of the NSS 
family, is a model transporter for understanding the relationships 
between molecular mechanism and atomic structure in a broad 
range of sodium-dependent and sodium-independent secondary 
transporters’ '*. At present there is a controversy over whether 
there are one or two high-affinity substrate binding sites in 
LeuT. The first-reported crystal structure of LeuT, together with 
subsequent functional and structural studies, provided direct evid- 
ence for a single, high-affinity, centrally located substrate-binding 
site, defined as the S1 site’*!>. Recent binding, flux and molecular 
simulation studies, however, have been interpreted in terms of a 
model where there are two high-affinity binding sites: the central, 
S1, site and a second, the S2 site, located within the extracellular 
vestibule’. Furthermore, it was proposed that the S1 and S2 sites 
are allosterically coupled such that occupancy of the S2 site is
required for the cytoplasmic release of substrate from the S1 site’®. 
Here we address this controversy by performing direct measure- 
ment of substrate binding to wild-type LeuT and to S2 site mutants 
using isothermal titration calorimetry, equilibrium dialysis and 
scintillation proximity assays. In addition, we perform uptake 
experiments to determine whether the proposed allosteric coup- 
ling between the putative S2 site and the S1 site manifests itself in 
the kinetics of substrate flux. We conclude that LeuT harbours a 
single, centrally located, high-affinity substrate-binding site and 
that transport is well described by a simple, single-substrate kinetic 
mechanism. 

We first measured the thermodynamic response and stoichiometry 
of L-leucine binding to LeuT using isothermal titration calorimetry 
(ITC). To minimize the potential for artefacts in our binding assays 
arising from endogenously bound Leu co-purifying with LeuT, we 
extensively washed cell membranes with Na⁺-free buffer containing
the Na⁺ chelator 15-crown-5"”. For the wild-type LeuT-Leu interaction,
ITC binding isotherms were best fitted by a single-site model
with a substrate-to-LeuT stoichiometry, N, of 0.70 ± 0.01 and a dissociation
constant, Kd, of 54.7 ± 1.8 nM (Fig. 1a and Supplementary
Table 1). Binding of Leu to LeuT is driven by enthalpic and entropic
factors with a ΔH of −3.93 ± 0.02 kcal mol⁻¹ and a −TΔS of
−6.01 ± 0.13 kcal mol⁻¹. Thermodynamic binding models of higher
complexity describing two-site random- or sequential-binding modes
yielded poorer fits to the data, with either high χ² values or non-
converging parameters.
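
As a consistency check on the reported thermodynamic parameters, the following sketch compares the binding free energy computed from the dissociation constant (ΔG = RT ln Kd, with a 1 M standard state assumed) against the sum of the enthalpic and entropic terms (ΔG = ΔH − TΔS). The function names are illustrative, and the temperature of 25 °C is the one stated for the ITC experiments.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g_from_kd(kd_molar, temp_k=298.15):
    """Binding free energy (kcal/mol) from Kd, assuming a 1 M standard state."""
    return R_KCAL * temp_k * math.log(kd_molar)


def delta_g_from_components(delta_h, minus_t_delta_s):
    """Binding free energy (kcal/mol) as delta_H + (-T * delta_S)."""
    return delta_h + minus_t_delta_s


# Values reported for wild-type LeuT: Kd = 54.7 nM,
# delta_H = -3.93 kcal/mol, -T*delta_S = -6.01 kcal/mol.
dg_kd = delta_g_from_kd(54.7e-9)
dg_sum = delta_g_from_components(-3.93, -6.01)
```

Both routes give a free energy close to −9.9 kcal mol⁻¹, so the reported Kd, ΔH and −TΔS are mutually consistent within rounding.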

The measured stoichiometry, of 0.70, suggests that approximately 
30% of LeuT in the ITC cell is unable to bind titrated substrate. This 
could be due to incomplete removal of endogenously bound substrate 


despite extensive ‘washing’ of the membranes. To weaken substrate 
binding and thus diminish the relative proportion of Leu-bound LeuT, 
as well as to specifically probe the interaction of substrate with LeuT, 
we mutated Tyr 108 to Phe, thereby disrupting the hydrogen bond 
between the hydroxyl group of Tyr 108 and a carboxylate oxygen of 
substrate bound to the S1 site’*. We proposed that by ablating the 
hydrogen bond between Tyr 108 and Leu, the Tyr 108 Phe mutation 
would reduce the enthalpy of Leu binding to the S1 site, thus allowing 
us to isolate apo-LeuT more readily. 

Figure 1 | Leu binding measured by ITC and equilibrium dialysis. a, b, ITC
data for Leu binding to wild-type LeuT (a) and Leu binding to mutant
Tyr 108 Phe-LeuT* (see Methods) (b). Raw injection heats (expressed as
differential power) are shown in the top panels and the corresponding specific
binding isotherms (calculated from the integrated injection heats and
normalized to moles of injectant) are shown in the bottom panels, determined
at 25 °C and pH 7.0. Square brackets denote concentration. c, d, Quantitation of
[³H]Leu-binding stoichiometry by equilibrium dialysis for the wild type (open
circle, solid line) or the Leu 400 Ala mutant (open triangle, dashed line) (c), and
for Tyr 108 Phe-LeuT* (d). Error bars, s.e.m.; n = 2.

Similar to the case for wild-type LeuT, the binding isotherm for Leu
binding to the Tyr 108 Phe mutant was best fitted by a single-site
model (Fig. 1b). Reflecting the predicted binding-site perturbation, the
dissociation constant increased to Kd = 1.4 ± 0.1 µM; the stoichiometry
parameter also increased (N = 0.79 ± 0.01) relative to the
wild-type transporter (Supplementary Table 1). Notably, ΔH
decreased in magnitude to −1.92 ± 0.03 kcal mol⁻¹, a difference of 2.01 kcal mol⁻¹
from wild-type LeuT and consistent with the loss of a single hydrogen
bond between LeuT and a single substrate molecule bound at the S1
site.

Because the stoichiometry values from the ITC experiments ranged 
from 0.7 to 0.8, we were compelled to determine how much residual 
substrate remained bound to LeuT. To measure the amount of ‘free’ 
amino acid present in our LeuT samples, we employed quantitative
amino-acid analysis (qAAA). The qAAA results (Supplementary
Tables 2-7) show that the molar ratio of free Leu to LeuT is approxi- 
mately 6% for Tyr 108 Phe but is negligible for the wild-type prepara- 
tions. The presence of more free Leu in the Tyr 108 Phe preparations 
was unexpected and may be due to variations in individual membrane 
preparations as well as variability in qAAA determinations. Even if all 
of the free Leu is bound to LeuT, however, the fraction of LeuT bound 
with substrate does not fully account for the substoichiometric values 
obtained from ITC. Possible explanations for the substoichiometric 
binding of substrate are that the LeuT samples used in the experiments 
contain trace amounts of contaminating proteins, as judged by SDS- 
polyacrylamide gel electrophoresis (Supplementary Fig. 1), that there 
is a small amount of protein aggregation, as judged by fluorescence- 
detection size-exclusion chromatography'® (Supplementary Fig. 1), or 
that a fraction of LeuT is not competent to bind substrate. 

To corroborate the binding parameters obtained by ITC, we used 
equilibrium dialysis to measure [³H]Leu binding to LeuT. Data for
wild-type LeuT and the Tyr 108 Phe mutant were well fitted by a
single-site binding equation (Fig. 1c, d) with respective stoichiometries
of 0.73 ± 0.03 and 0.72 ± 0.02 (Supplementary Table 1). Taken
together, both the ITC and the equilibrium dialysis data are consistent 
with there being a single substrate-binding site. The observed differ- 
ences between wild-type LeuT and Tyr 108 Phe demonstrate that we 
can use the LeuT crystal structure to perturb binding of substrate to the 
S1 site both specifically and predictably. 
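
A single-site binding equation of the kind referred to above can be sketched as follows. In equilibrium dialysis the free ligand concentration is measured directly, so bound ligand per transporter is N·[L]/(Kd + [L]). The fit below uses a Hanes-Woolf linearization ([L]/B = [L]/N + Kd/N) purely for illustration; it is an assumption, not the authors' fitting procedure, and the data are synthetic.

```python
def bound_per_protein(free_ligand, n_sites, kd):
    """Moles of ligand bound per mole of protein for one class of sites."""
    return n_sites * free_ligand / (kd + free_ligand)


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_single_site(free_ligand, bound):
    """Estimate (N, Kd) from the Hanes-Woolf line L/B = L/N + Kd/N."""
    ratios = [l / b for l, b in zip(free_ligand, bound)]
    slope, intercept = linear_fit(free_ligand, ratios)
    n_sites = 1.0 / slope
    return n_sites, intercept * n_sites


# Synthetic, noise-free data (nM) generated with N = 0.73 and Kd = 55 nM.
free = [10.0, 30.0, 100.0, 300.0, 1000.0]
bound = [bound_per_protein(l, 0.73, 55.0) for l in free]
n_fit, kd_fit = fit_single_site(free, bound)
```

With real, noisy data a direct nonlinear least-squares fit of `bound_per_protein` would usually be preferred, because linearization distorts the error structure.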

We next probed the presence of the S2 site by asking whether 
mutations in this proposed site would also measurably perturb binding 
of substrate to LeuT. In fact, it is claimed that mutation of Leu 400 to 
Cys ablates Leu binding to the S2 site, reducing overall binding to LeuT 
by approximately one-half"’. We therefore measured [³H]Leu binding
to mutants Leu 400 Ala and Leu 400 Cys. Using equilibrium dialysis,
we observed that the extent of Leu binding to Leu 400 Ala was comparable
to that for the wild-type transporter (Fig. 1c and Supplementary
Table 1). This conclusion was reinforced using the scintillation
proximity assay (SPA) method”’ to compare [³H]Leu binding to
wild type, Leu 400 Ala and Leu 400 Cys (Fig. 2a). We find that neither
the Leu 400 Ala nor the Leu 400 Cys mutant shows any significant
change in Leu binding, as measured by maximum binding capacity 
or dissociation constant, relative to wild-type LeuT (Supplementary 
Table 1). 

Figure 2 | Leu binding measured by scintillation proximity assays.
a, Saturation binding isotherms and nonlinear regression analysis for wild-type
LeuT (open circle, solid line), Leu 400 Ala mutant (open triangle, dashed line)
and Leu 400 Cys mutant (open square, dotted line). c.p.m., counts per minute.
b, Saturation binding at high LeuT concentration (~20 Kd), quantifying
substrate-binding stoichiometry. Symbols and lines are as in a. c, Saturation
binding for wild-type LeuT in the absence (same data as in a) or presence of
1 mM clomipramine (closed circle, dashed line). Error bars, s.e.m.; n = 2.

A limitation of the SPA method is the unreliable determination of the
scintillant counting efficiency, which in turn complicates an accurate
conversion of measured radioactivity in counts per minute to moles of
radioligand. To circumvent the need for this transformation, we quantified
the binding-site stoichiometry by titrating transporter protein at
20-fold excess over Kd with 0.06-3.0 molar equivalents of [³H]Leu”’.
The resulting response is initially first order with respect to Leu concentration,
as binding sites are in excess over ligand. When binding
reaches saturation, binding is zeroth order with respect to Leu concentration.
The abscissa of the intersection of the first-order and zeroth-order
linear regressions provides the ligand concentration equivalent to the
binding-site concentration, thus defining a stoichiometric value that is
independent of ordinate radioactivity conversions. Using this method,
we measured [³H]Leu binding to wild-type LeuT and to the Leu 400 Ala
and Leu 400 Cys mutants (Fig. 2b). For each of these transporters,
binding-site saturation occurs at a nearly identical ligand concentration,
each corresponding to a substrate-to-transporter stoichiometry of
about 0.8, confirming that mutations at the Leu 400 position do not
decrease the binding capacity of LeuT for Leu (Supplementary Table 1).
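
The intersection approach can be sketched in a few lines: fit one straight line to the rising (first-order) points and another to the plateau (zeroth-order) points, then solve for the abscissa where they cross. The split between the two regimes is chosen by hand here, and the data are synthetic values built to saturate at 0.8 molar equivalents; none of this is taken from the paper's raw data.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intersection_abscissa(rising, plateau):
    """x-coordinate where the rising-phase and plateau regressions cross."""
    m1, b1 = linear_fit([x for x, _ in rising], [y for _, y in rising])
    m2, b2 = linear_fit([x for x, _ in plateau], [y for _, y in plateau])
    return (b2 - b1) / (m1 - m2)


# Synthetic titration: signal (arbitrary c.p.m.) versus molar equivalents
# of ligand, rising linearly and then saturating at 0.8 equivalents.
rising_points = [(0.1, 100.0), (0.2, 200.0), (0.4, 400.0), (0.6, 600.0)]
plateau_points = [(1.5, 800.0), (2.0, 800.0), (2.5, 800.0), (3.0, 800.0)]
stoichiometry = intersection_abscissa(rising_points, plateau_points)
```

Because only the x-coordinate of the crossing is used, the result does not depend on how counts per minute are converted to moles, which is the point of the method.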

We performed a final saturation binding analysis to assess the effect 
of clomipramine, an inhibitor of LeuT transport”'” that was proposed 
to displace Leu binding from the S2 site’®. We saw no change in the 
binding of Leu to wild-type LeuT in the presence of 10 nM LeuT and 
1mM clomipramine, thus indicating that Leu- and clomipramine- 
binding sites do not overlap (Fig. 2c and Supplementary Table 1). 
This is consistent with previous data indicating that clomipramine is 
a non-competitive inhibitor of LeuT transport”. 

To augment our assessment of binding-site stoichiometry, we next 
asked whether LeuT-catalysed transport is better modelled by a single- 
site kinetic model or one in which two substrates are bound. Previously 
reported flux measurements for several substrates showed that LeuT 
steady-state kinetics is well described by single-site Michaelis—Menten 
parameters. The overall slow turnover rate of LeuT under those con- 
ditions, however, may have obscured the detection of more complex 
kinetic behaviour. Here we sought to re-evaluate the kinetics of Ala 
transport under conditions tailored to promote higher turnover, to 
determine whether transport kinetics are better fitted by a one- or a 
two-site model. We first determined that uptake is more robust at low 
(acidic) pH values, with a maximum at pH 5, and that mutation of 
Lys 288, a residue protruding into the hydrophobic portion of the 
membrane bilayer", to Ala (LeuTS) further enhanced substrate flux 
(Supplementary Fig. 2). Steady-state kinetics for Ala uptake by LeuT* 
under optimized conditions was measured in the presence of a 200 mM
inward Na⁺ gradient. The data were well fitted by the Michaelis-Menten rectangular
hyperbola with a Michaelis constant of Km = 0.79 ± 0.06 µM
and a maximal velocity of Vmax = 11,006 ± 281 pmol min⁻¹ mg⁻¹
(Fig. 3a). The corresponding turnover number is kcat = 0.65 min⁻¹,
which is about sixfold higher than that measured for wild-type LeuT
at pH 7 with a 100 mM Na⁺ gradient!® (Supplementary Table 8).
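
The relationship between the reported Vmax and kcat can be sketched as a unit conversion: kcat is Vmax expressed per mole of transporter rather than per milligram. The molar mass used below (59,000 g mol⁻¹) is an assumption chosen for illustration and is not stated in the text; with it, the reported Vmax reproduces the reported turnover number to within rounding.

```python
def michaelis_menten(substrate, vmax, km):
    """Rate for a single-site rectangular hyperbola."""
    return vmax * substrate / (km + substrate)


def turnover_number(vmax_pmol_min_mg, molar_mass_g_mol):
    """Convert Vmax (pmol min^-1 mg^-1) to kcat (min^-1)."""
    mol_substrate_per_min_per_mg = vmax_pmol_min_mg * 1e-12
    mol_protein_per_mg = 1e-3 / molar_mass_g_mol
    return mol_substrate_per_min_per_mg / mol_protein_per_mg


ASSUMED_MOLAR_MASS = 59_000.0  # g/mol, hypothetical value for the transporter

kcat = turnover_number(11_006.0, ASSUMED_MOLAR_MASS)
half_max_rate = michaelis_menten(0.79, 11_006.0, 0.79)  # rate at [S] = Km
```

The second call simply confirms the defining property of Km: at a substrate concentration equal to Km, the rate is half of Vmax.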

Figure 3 | Transport kinetics of [³H]Ala uptake. a, Steady-state Ala uptake as
a function of Ala concentration at pH 5. Inset, the corresponding Eadie-
Hofstee plot with linear regression (r² = 0.93). Error bars, s.e.m.; n = 4.
b, Steady-state Ala uptake at pH 5 in the presence of valinomycin to induce a
membrane potential. Inset, the corresponding Eadie-Hofstee plot with linear
regression (r² = 0.96). Error bars, s.e.m.; n = 2. S, substrate.

We reasoned that transport would be further stimulated by including
valinomycin. Addition of this K⁺-selective ionophore will induce a
negative-inside membrane potential and prevent the build-up of positive
charge inside the liposomes during transport. With valinomycin
present, kcat increased to 2.3 min⁻¹ yet Km remained nearly unchanged
at 0.75 ± 0.06 µM (Fig. 3b and Supplementary Table 8). Similar to
transport under membrane-neutral conditions, valinomycin-stimulated
transport is well fitted by a single-site Michaelis-Menten kinetic model.
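
The Eadie-Hofstee plots mentioned in the figure insets rest on a rearrangement of the Michaelis-Menten equation, v = Vmax − Km·(v/[S]), so that a single-site mechanism gives a straight line with slope −Km and intercept Vmax. The sketch below applies that transformation to synthetic, noise-free rates generated from the reported parameters; the substrate concentrations are made up for illustration.

```python
def michaelis_menten(substrate, vmax, km):
    """Rate for a single-site rectangular hyperbola."""
    return vmax * substrate / (km + substrate)


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def eadie_hofstee(substrates, rates):
    """Return (Km, Vmax) from the line v = Vmax - Km * (v / [S])."""
    v_over_s = [v / s for v, s in zip(rates, substrates)]
    slope, intercept = linear_fit(v_over_s, rates)
    return -slope, intercept


# Synthetic rates from Km = 0.79 uM and Vmax = 11,006 pmol min^-1 mg^-1.
substrates_um = [0.1, 0.3, 1.0, 3.0, 10.0]
rates = [michaelis_menten(s, 11_006.0, 0.79) for s in substrates_um]
km_fit, vmax_fit = eadie_hofstee(substrates_um, rates)
```

Systematic curvature in such a plot would hint at more than one class of site or at cooperativity, which is why linearity is a useful visual check on a single-site model.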

In conjunction with the Michaelis-Menten modelling, the steady- 
state kinetic data were fitted with alternative kinetic models that 
describe kinetic mechanisms involving two binding sites: the Hill 
equation” for a random-order, cooperative-binding response; and a 
two-site, ordered-binding kinetic model”. Data fitted to the Hill equation
converged with a Hill slope of nH = 0.96 ± 0.03, indicating that
there are not multiple interacting substrate sites underlying the kinetic
behaviour of LeuT. A two-site, ordered-binding reaction scheme,
which provides explicit treatment for both singly and doubly occupied
transporter complexes™, was fitted to the flux data. Although Vmax was
calculated to be 10,965 ± 308 pmol min⁻¹ mg⁻¹, which is nearly
identical to the value from the Michaelis-Menten model, the apparent dissociation
constant, Ks, and the dissociation coefficient, ~, converged to
6.8 ± 22 nM and 114 ± 360, respectively, indicating that these parameters
are not well constrained by the data.
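
To illustrate why a Hill slope near 1 argues against interacting sites, the sketch below writes the Hill equation as v = Vmax·[S]^n / (K^n + [S]^n) and recovers n from a Hill plot, log(v/(Vmax − v)) against log[S], whose slope equals n. It assumes Vmax is known and uses synthetic data; it is not the regression procedure used in the paper.

```python
import math


def hill_rate(substrate, vmax, k_half, n_hill):
    """Hill equation; reduces to Michaelis-Menten when n_hill == 1."""
    return vmax * substrate**n_hill / (k_half**n_hill + substrate**n_hill)


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hill_slope(substrates, rates, vmax):
    """Slope of log10(v / (Vmax - v)) versus log10([S])."""
    xs = [math.log10(s) for s in substrates]
    ys = [math.log10(v / (vmax - v)) for v in rates]
    slope, _ = linear_fit(xs, ys)
    return slope


substrates_um = [0.1, 0.3, 1.0, 3.0, 10.0]
rates = [hill_rate(s, 11_006.0, 0.79, 0.96) for s in substrates_um]
n_fit = hill_slope(substrates_um, rates, 11_006.0)
mm_rate = hill_rate(1.0, 11_006.0, 0.79, 1.0)
```

A slope clearly above 1 would indicate positive cooperativity between sites; a value indistinguishable from 1, as reported, leaves the simple hyperbola as the more economical description.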

In conclusion, we have examined the stoichiometry of substrate 
binding to LeuT using multiple methods, and find consistent evidence 
for a single, high-affinity substrate-binding site. We find no evidence 
to support the notion that mutation of Leu 400 to Ala or Cys, or the 
presence of clomipramine, perturbs the stoichiometry of substrate 
binding. Furthermore, the kinetics of substrate flux is best fitted by a 
single-substrate kinetic model. Taken together, these data refute the
two-substrate binding model for LeuT and are consistent with previously
determined crystallographic and functional data. The
mechanistic implications of our work are that transport of substrate
by LeuT occurs through a singly occupied intermediate where substrate
is bound to a central, high-affinity site (the S1 site; Fig. 4). We
maintain, however, that substrate may indeed transiently bind to weak,
low-affinity sites as it transits from the extracellular solution to the S1
site and from the S1 site to the intracellular solution, as suggested by
previous structural and computational studies.

Figure 4 | LeuT mechanism. Starting from the apo transporter in an open-to-outside
conformation (a), substrate (S) and sodium ions bind, forming the
outward-facing occluded conformation (b) characterized by closure of a ‘thin
gate’ over the S1 substrate-binding site. Clomipramine, which inhibits
transport, binds in the extracellular vestibule directly above the thin gate,
near the putative S2 site. The substrate- and ion-bound transporter undergoes
structural isomerization to form the inward-facing conformation (c), allowing
release of substrate and ions to the intracellular solution, thereby generating an
open-to-inside apo transporter (d) that isomerizes to the open-to-outside
conformation (a).


METHODS SUMMARY 


We washed membranes containing LeuT or mutants three times with buffer containing
50 mM Tris-HCl (pH 8.0) and 10 mM 1,4,7,10,13-pentaoxacyclopentadecane
(15-crown-5), and purified them as described in either ref. 14 (ITC and equilibrium
dialysis) or ref. 16 (SPA). For ITC experiments, we determined the protein and Leu
concentrations and the residual free-amino-acid content of purified LeuT by qAAA
(Supplementary Tables 1-6). An extinction coefficient of 136,459 cm⁻¹ M⁻¹ was
empirically determined by qAAA measurements of LeuT. We performed ITC
experiments at 25 °C using an ITC200 calorimeter (MicroCal) with either 20 µM
wild-type or 30 µM Tyr 108 Phe LeuT in the cell, and titrated with 200 µM or
500 µM L-Leu, respectively. Equilibrium dialysis experiments were performed by
placing 100 µl of 60 nM wild-type LeuT, 94 nM Leu 400 Ala or 5.7 µM Tyr 108 Phe
in the sample chamber of a Rapid Equilibrium Dialysis Device plate (Thermo
Scientific) and 300 µl of [³H]Leu at 0.3-30 µM in the buffer chamber. Saturation
binding experiments using SPA were performed with 10 nM protein incubated with
2 mg ml⁻¹ Cu²⁺-YSi SPA beads (GE Healthcare) in the presence of 0.3-600 nM
[³H]Leu. For measurement of binding-site concentration using SPA, we used
400 nM protein and 25-1,200 nM [³H]Leu. For transport assays, LeuT proteoliposomes
were prepared as previously described in a 1:100 protein/lipid weight ratio.
Transport assays were conducted at 27 °C with 10 µg ml⁻¹ protein. To determine
steady-state kinetic parameters, we allowed reactions to proceed for 2 min and
quenched, filtered and analysed them using GRAPHPAD PRISM 4 as previously
described.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Mutagenesis and protein purification. Site-directed mutants of LeuT were prepared
using PCR. The Tyr 108 Phe mutant of LeuT was made in the background of
the Lys 288 Ala mutation (Tyr 108 Phe-LeuT'). Wild-type LeuT and mutants
bearing a carboxy-terminal His8 tag were expressed in C41 cells and purified as
previously described with the exception that cell membranes were washed three
times with 50 mM Tris-HCl (pH 8.0) supplemented with 10 mM
1,4,7,10,13-pentaoxacyclopentadecane (15-crown-5) to facilitate the removal of bound Leu
and augment the generation of apo-LeuT. Purified protein destined for equilibrium
dialysis and ITC assays was concentrated to 5-30 µM using a concentrator with a
50-kDa cut-off, and dialysed at 4 °C for 24 h against buffer I (20 mM Tris-citrate
(pH 7.0), 200 mM NaCl and 1 mM dodecyl maltoside), with three buffer changes.
Protein for scintillation proximity assays was purified in buffer II (150 mM
Tris-MES (pH 7.5), 50 mM NaCl, 1 mM dodecyl maltoside and 20% glycerol).
Equilibrium dialysis assays on wild-type LeuT demonstrate no notable differences
using either buffer I or buffer II. Reducing conditions were maintained for
preparations of the Leu 400 Cys mutant using 2 mM tris(2-carboxyethyl)phosphine
(TCEP). The concentration of protein and ligand used in the ITC measurements
was directly determined by qAAA on material that was subjected to overnight acid
hydrolysis in 0.02 N HCl. The extent to which the purified LeuT starting material
was contaminated by residual Leu was determined by qAAA of non-hydrolysed
material to measure the free-amino-acid content. All qAAA measurements were
performed at the Keck Biotechnology Resource Laboratory at Yale University. All
other protein concentrations were estimated by absorbance spectroscopy using a
molar extinction coefficient of 136,459 cm⁻¹ M⁻¹ at λ = 280 nm for the His-tagged
protein, derived from the extinction coefficient predicted from primary sequence
(ProtParam; http://expasy.org/tools/protparam.html) and empirically corrected by
qAAA measurements of LeuT (A280 of unity = 0.43 mg ml⁻¹). Sample purity was
assessed by SDS-polyacrylamide gel electrophoresis under reducing conditions
using 12.5% Tris-Gly gels (Supplementary Fig. 1a). Protein dispersity was monitored
by fluorescence-detection size-exclusion chromatography measuring intrinsic Trp
fluorescence (Supplementary Fig. 1b).
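
The absorbance-based concentration estimate above follows the Beer-Lambert law, A = ε·c·l. The sketch below is a minimal illustration, assuming a 1-cm path length (not stated in the text) and using function names of our own. It converts an A280 reading into a molar concentration and shows the molar mass implied by the two quoted figures.

```python
EPSILON_280 = 136_459.0       # M^-1 cm^-1, extinction coefficient from the text
MG_PER_ML_AT_A280_ONE = 0.43  # mg ml^-1 at A280 = 1, from the text


def molar_concentration(a280, path_cm=1.0):
    """Beer-Lambert: c = A / (epsilon * l), in mol per litre."""
    return a280 / (EPSILON_280 * path_cm)


def implied_molar_mass():
    """Molar mass (g/mol) implied by the two figures above.

    At A280 = 1 the molar concentration is 1/epsilon and the mass
    concentration is 0.43 g per litre (mg/ml equals g/l), so
    M = 0.43 * epsilon.
    """
    return MG_PER_ML_AT_A280_ONE * EPSILON_280


print(f"A280 = 1 corresponds to {molar_concentration(1.0) * 1e6:.2f} µM")
print(f"Implied molar mass: {implied_molar_mass() / 1000:.1f} kDa")
```

The implied mass of roughly 58.7 kDa is only a consistency check on the quoted numbers, not a value reported by the authors.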

Isothermal titration calorimetry. A solution of wild-type LeuT or Tyr 108 Phe-LeuT
(at 20 or 30 µM, respectively, in buffer I) was loaded into the sample cell of
an ITC200 calorimeter (MicroCal). L-Leu at 200 or 500 µM for titrations with
wild-type LeuT or Tyr 108 Phe-LeuT‘, respectively, was dissolved in buffer I and
loaded into the injection syringe. Before data collection, the system was
equilibrated to 25 °C with the stirring speed set to 1,000 r.p.m. Titration curves for
Tyr 108 Phe-LeuT* binding Leu were generated by five successive 1.5-µl injections
followed by fourteen 2.0-µl injections at 180-s intervals. Titration curves for
wild-type LeuT binding Leu were generated with nineteen 2.0-µl injections at 180-s
intervals. Control injections of ligand into buffer I without protein were done to
determine background corrections. The integrated heats from each injection,
normalized to the moles of ligand per injection, were fitted to a single-site binding
isotherm using ORIGIN 7. Final values of Kd, stoichiometry (N), ΔH and −TΔS
were determined from the average of two to four ITC runs.
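
As a rough check on the titration design, the sketch below totals the injected volumes and estimates the ligand-to-protein molar ratio at the end of each run. The cell volume of 200 µl is an assumption (the text does not give it), the function names are ours, and the calculation ignores the displacement of cell contents during injection, so the ratios are approximate.

```python
def total_injected_ul(schedule):
    """Total titrant volume (µl) for a list of (count, volume_ul) injections."""
    return sum(count * volume_ul for count, volume_ul in schedule)


def final_molar_ratio(schedule, ligand_um, protein_um, cell_ul):
    """Approximate ligand:protein ratio after the last injection.

    µl x µM gives pmol, so both amounts are in the same unit.
    """
    ligand_pmol = total_injected_ul(schedule) * ligand_um
    protein_pmol = cell_ul * protein_um
    return ligand_pmol / protein_pmol


# Injection schedules and concentrations as described in the text.
WILD_TYPE = [(19, 2.0)]
TYR108PHE = [(5, 1.5), (14, 2.0)]
ASSUMED_CELL_UL = 200.0  # assumption, not stated in the text

print("wild type:", total_injected_ul(WILD_TYPE), "µl, ratio",
      round(final_molar_ratio(WILD_TYPE, 200.0, 20.0, ASSUMED_CELL_UL), 2))
print("Tyr 108 Phe:", total_injected_ul(TYR108PHE), "µl, ratio",
      round(final_molar_ratio(TYR108PHE, 500.0, 30.0, ASSUMED_CELL_UL), 2))
```

Under that assumed cell volume, both titrations end with ligand in roughly two- to threefold molar excess, which is enough to pass the equivalence point of a one-site isotherm.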

Equilibrium dialysis. For each replicate, 100 µl of either 60 nM wild-type LeuT,
94 nM Leu 400 Ala or 5.7 µM Tyr 108 Phe-LeuT* protein in buffer I was placed in
the sample chamber of a Rapid Equilibrium Dialysis Device plate (Thermo
Scientific) and 300 µl of [³H]Leu at 0.3-30 µM (0.27 Ci mmol⁻¹) in buffer I was
placed in the buffer chamber. The unit was covered with sealing tape and
incubated at room temperature (23 °C) on a shaker for 6 h. To determine the
concentrations of total and free ligands, 10-µl aliquots were removed from the sample and
buffer chambers, respectively, and added to 6 ml of Ultima Gold scintillation fluid.
The concentrations of free and total Leu were calculated from the tSIE
(transformed spectral index of an external standard)-corrected d.p.m. (disintegrations
per minute) values measured using a liquid scintillation counter. Data were
analysed as a single-site binding function. Values of Kd, Bmax (maximal binding) and N
were determined from the average of two independent experiments, with two to
four replicates each.
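
The logic of the dialysis analysis can be sketched as follows. At equilibrium the buffer chamber reports free ligand and the sample chamber reports free plus bound ligand, so their difference is the bound concentration, which is then described by a single-site function. The Kd and Bmax values below are hypothetical placeholders, not the fitted results, and the function names are ours.

```python
def bound_ligand(total_nm, free_nm):
    """Bound ligand = sample-chamber (total) minus buffer-chamber (free)."""
    return total_nm - free_nm


def single_site_binding(free_nm, bmax_nm, kd_nm):
    """Single-site isotherm: B = Bmax*[L] / (Kd + [L])."""
    return bmax_nm * free_nm / (kd_nm + free_nm)


def stoichiometry(bmax_nm, protein_nm):
    """Binding sites per transporter, N = Bmax / [protein]."""
    return bmax_nm / protein_nm


PROTEIN_NM = 60.0  # wild-type LeuT concentration from the text
HYPOTHETICAL_KD_NM = 25.0    # illustration only
HYPOTHETICAL_BMAX_NM = 60.0  # illustration only: one site per transporter

# Simulate the two chamber readings across the 0.3-30 µM ligand range.
for free in [300.0, 1000.0, 3000.0, 10000.0, 30000.0]:
    total = free + single_site_binding(free, HYPOTHETICAL_BMAX_NM, HYPOTHETICAL_KD_NM)
    print(f"free={free:8.0f} nM  bound={bound_ligand(total, free):6.2f} nM")

print("N =", stoichiometry(HYPOTHETICAL_BMAX_NM, PROTEIN_NM))
```

A stoichiometry near 1 corresponds to the single-site conclusion; a two-site transporter would give a Bmax near twice the protein concentration.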

Scintillation proximity assays. For saturation binding analysis, 10 nM LeuT was
incubated with 2 mg ml⁻¹ Cu²⁺-YSi SPA beads in buffer II in the presence of
0.3-600 nM [³H]Leu (10.8 Ci mmol⁻¹). The reactions were mixed on an orbital
microplate shaker at room temperature. Plate readings were taken at 2, 20, 40 and
60 h using a Wallac Microbeta plate counter, although for each experiment no
significant change was observed after 20 h incubation. SPA experiments to quantify
the binding-site concentration in each sample were performed as described above,
but using 400 nM LeuT and 25-1,200 nM [³H]Leu (10.8 Ci mmol⁻¹). For all assays,
specific binding was calculated by subtracting the background radioligand binding
assessed by duplicate binding measurements in the presence of 5 mM L-Ala.

Transport time course. LeuT was reconstituted into lipid vesicles as previously
described using internal buffer appropriate for the experiment (20 mM HEPES-Tris
(pH 7), 200 mM KCl or 20 mM citrate-Tris (pH 6, pH 5 or pH 4) and 200 mM
KCl). Transport reactions were assembled by diluting LeuT proteoliposomes to a
final protein concentration of 10 µg ml⁻¹ in external buffer (20 mM HEPES-Tris
(pH 7.0), 200 mM NaCl or 20 mM citrate-Tris (pH 6.0, pH 5.0 or pH 4.0) and
200 mM NaCl) at 27 °C with 500 nM [³H]Ala (83 Ci mmol⁻¹). Uptake was followed
by removing and quenching 100-µl aliquots of the reaction in ice-cold
internal buffer at various time points up to 40 min. Reactions were filtered and
analysed as previously described. Non-specific uptake was assessed by repeating
the time course for the same liposome preparation under identical conditions
except for the replacement of external NaCl by KCl. Non-specific uptake was then
subtracted from the total uptake measured to calculate the specific uptake. Each
experiment was performed in duplicate.

Steady-state kinetics. LeuT proteoliposomes at 10 µg ml⁻¹ were incubated with
0.050-8.0 µM [³H]Ala (8.3 Ci mmol⁻¹) at 27 °C for 2 min in external buffer
(20 mM citrate-Tris (pH 5.0) and 200 mM NaCl) with or without 50 nM valinomycin.
Preliminary time course experiments done with 0.050, 0.40, 1.0 and 8.0 µM
[³H]Ala established that transport remained linear through the 2-min time point.
Data from two to four experiments, each repeated in duplicate, were fitted to the
Michaelis-Menten equation and analysed by linear regression to an
Eadie-Hofstee transformation.
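
The Eadie-Hofstee transformation rearranges the Michaelis-Menten equation to v = Vmax − Km·(v/[S]), so a straight-line fit of v against v/[S] gives −Km as the slope and Vmax as the intercept. The sketch below demonstrates this on synthetic, noise-free data. The parameter values and sampling points are assumptions chosen to resemble the ranges in the text, and a plain least-squares routine stands in for the software the authors used.

```python
def michaelis_menten(s, vmax, km):
    """Single-site rate law: v = Vmax*[S] / (Km + [S])."""
    return vmax * s / (km + s)


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def eadie_hofstee(substrate, velocity):
    """Regress v on v/[S]; returns (Vmax, Km) from intercept and -slope."""
    ratio = [v / s for v, s in zip(velocity, substrate)]
    slope, intercept = linear_fit(ratio, velocity)
    return intercept, -slope


# Hypothetical, noise-free data for illustration only.
TRUE_VMAX = 11000.0  # pmol min^-1 mg^-1 (assumed)
TRUE_KM = 0.75       # µM (assumed)
substrate = [0.05, 0.1, 0.2, 0.4, 1.0, 2.0, 4.0, 8.0]  # µM, spans 0.050-8.0
velocity = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in substrate]

vmax_fit, km_fit = eadie_hofstee(substrate, velocity)
print(f"Vmax = {vmax_fit:.1f}, Km = {km_fit:.3f}")
```

With real data, curvature in the Eadie-Hofstee plot would hint at more than one site, which is why the linear transformation is a useful complement to the direct fit.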

To test for multisite cooperative kinetics, data were modelled according to the
Hill equation, $v = V_{\max}[S]^{n}/(K' + [S]^{n})$, where $v$ is the reaction velocity, $n$ is the
Hill coefficient and $K'$ is the apparent dissociation constant, allowing the parameters
$n$, $K'$ and $V_{\max}$ to float. Alternatively, data were modelled by a two-site,
ordered-binding equation, $v = \left(V_{\max}[S]^{2}/(\alpha K_S^{2})\right)/\left(1 + [S]/K_S + [S]^{2}/(\alpha K_S^{2})\right)$, allowing the
parameters $\alpha$, $K_S$ and $V_{\max}$ to float. To compare the Michaelis-Menten model
with the ordered-binding model, the F-test statistic was calculated according to the
GRAPHPAD PRISM manual using the following equation:
$F = \left((SS_{null} - SS_{alt})/(DF_{null} - DF_{alt})\right)/\left(SS_{alt}/DF_{alt}\right)$, where ‘null’ and ‘alt’ refer to the Michaelis-Menten
and ordered-binding models, respectively; SS is the absolute sum of squares of the
variance for each model; and DF is the number of degrees of freedom for each
model. For the Michaelis-Menten model, SS and DF were $3.564 \times 10^{7}$ and 68,
respectively. For the ordered-binding model, SS and DF were $3.579 \times 10^{7}$ and 67,
respectively.
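
The F statistic defined above can be evaluated directly from the quoted sums of squares and degrees of freedom. In the sketch below (function name ours), only the leading digits of the two sums of squares are used, because the power of ten they share cancels in the ratio.

```python
def extra_sum_of_squares_f(ss_null, df_null, ss_alt, df_alt):
    """F = ((SS_null - SS_alt) / (DF_null - DF_alt)) / (SS_alt / DF_alt)."""
    return ((ss_null - ss_alt) / (df_null - df_alt)) / (ss_alt / df_alt)


# Michaelis-Menten (null) versus two-site ordered binding (alternative).
# The common power of ten in both sums of squares cancels, so the
# mantissas quoted in the text are sufficient.
f_stat = extra_sum_of_squares_f(ss_null=3.564, df_null=68, ss_alt=3.579, df_alt=67)
print(f"F = {f_stat:.4f}")
```

Because the reported sum of squares for the ordered-binding model is slightly larger than for the Michaelis-Menten model, the statistic comes out slightly negative. The extra parameter therefore buys no improvement in fit, and the simpler single-site model is retained.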

Doha in Qatar with the Pearl Monument in the foreground. The city is host to branch campuses of renowned universities from around the world.

The growth of a desert jewel


Qatar’s research machine is a work in progress, but its funding opportunities are already 
luring international scientists to its increasing number of institutions. 


BY WALEED AL-SHOBAKKY 


Khaled Machaca enjoys the high-risk,
high-reward aspects of a start-up project.
His latest is particularly demanding.
Machaca has been tasked with establishing a 
research programme at a newly founded medi- 
cal college in Qatar: a small Middle Eastern 
country whose science enterprise, initiated only 
in the past decade, is itself a start-up of sorts. 
The challenges are manifold. Machaca has 
had to convince funders, the larger medical 
community and the public of the importance 
of his work. He has also had to source lab 
equipment in a place with few suppliers. To 
foster international collaborations, crucial to 
Qatari researchers’ success, he has had to help 
craft and customize a code of research ethics, 
created by Qatar’s Supreme Council of Health, 
that complies with both US and Qatari laws. 
And he has had to convince young scientists
that they can advance their careers and conduct
cutting-edge science in a country known
less for research than for hosting the news network
Al-Jazeera and, as was announced this
month, the 2022 soccer World Cup.

“We had serious challenges,” says Machaca,
who is associate research dean at Weill Cor- 
nell Medical College in Qatar (WCMC-Q), 
based in Doha. But he relishes the notion of 
building a programme from scratch. And the 
country has a big advantage: money. Scientists 
working in Qatar will find good funding and 
ample opportunities for big projects, but, like 
Machaca, they might have to deal with rigid 
bureaucracy, evolving research-ethics regula- 
tions and rules — on stem-cell research, for 
example — that could limit collaborative ven- 
tures. These trade-offs will help to determine 
Qatar’s success as it attempts to build a sustain- 
able science enterprise. 


OASIS FOR RESEARCH 

Qatar’s efforts to be hospitable to science come
amid a region-wide drive to engage with international
(mostly Western) academia and
scientific centres. From the early 2000s to
mid-2008 (when oil peaked at US$147 a barrel), fuel
prices repeatedly hit record highs, bringing a 
petrodollar downpour to the six oil-rich Gulf 
states of Saudi Arabia, Kuwait, the United Arab 
Emirates (UAE), Qatar, Bahrain and Oman. The 
revenues have driven attempts to energize ail- 
ing higher-education and research centres and 
to create new ones. This coincided with belt- 
tightening among academic and research insti- 
tutions in Europe and North America — a trend 
that increased with the global financial crisis. 
Saudi Arabia is pushing ahead with a pro- 
gramme that pairs international research cen- 
tres such as Germany’s Max Planck Institutes 
with domestic universities to modernize local 
science departments. And 2009 saw the launch 
of the more ambitious, more visible King 
Abdullah University of Science and Technology 
(KAUST), a graduate-level research university 
in Thuwal with a US$10-billion endowment. In 
the UAE, Abu Dhabi campuses of major higher-education
and research institutions such
as New York University and the Sorbonne
University in Paris have been set up. And in
2006, Qatari emir Hamad Bin Khalifa Al-Thani
pledged that 2.8% of the country’s gross
domestic product (which was about $100 billion
in 2008 according to the Economist
Intelligence Unit in London) would be
spent on scientific research.

In general, government leaders in the
Gulf states are setting up multibillion-dollar
research projects and high-profile partnerships
not only because they can, but because
they must. “The Gulf countries are in a
developmental phase,” says Kristin Diwan,
a Gulf expert at the American University
in Washington DC. They see a genuine need, she
says, for the “skills and learning that are required
to run their economies” and to diversify beyond
the rather volatile hydrocarbon sector.

Thus far, sustained Qatari government fund- 
ing has helped Machaca to increase his diabetes 
and obesity research programme from a hand- 
ful of staff to 60 faculty members, postdoctoral 
fellows and research-support personnel in less 
than two years. He says that Qatar’s research 
and funding environment is preferable to what 
many would find in the United States, given 
that country’s recent science-funding woes. 

In Qatar, the job of luring high-calibre 
researchers goes to the Qatar Foundation for 
Education, Science and Community Devel- 
opment and its sprawling offshoot institu- 
tions, which include branch campuses of US 
and European institutions, domestic research 
centres and the Qatar National Research Fund 
(QNRF). This agency aims to attract researchers
from far beyond the country’s borders to collab- 
orate with Qatar-based scientists on problems 
that the tiny state confronts — from diabetes 
to network security. The main QNRF grant 
mechanism, the National Priorities Research 
Program (NPRP), has over the first three years 
of its existence doled out close to a quarter of a 
billion US dollars on 266 research projects, with 
participants from more than 30 countries. 

That funding comes with conditions that are, 
at times, onerous. Collaborations must involve a 
Qatar-based researcher and most of the money 
must be spent in Qatar. The Qatar Foundation’s 
July 2010 request for proposals states that at least 
half of the proposed funded research days must 
be spent in Qatar; and at least 65% of the total 
annual budget must be used there. 

“We needed to make sure that international
researchers collaborate with scientists based in
Qatar while allowing at least 50% of the effort
to be conducted inside Qatar,” says Abdul Sattar
Al-Taie, executive director of the QNRF.
“That should contribute to our main goal of
building a research environment in the country.”
Arguably, such provisions help to diversify
Qatar’s economy for the post-hydrocarbon era
by transferring knowledge from foreign countries
to researchers inside Qatar.

But the rules mean that Qatar’s abundant 
funds can go only so far in attracting new 
research proposals — the small nation has a 
limited number of domestic collaborators. 
Eventually its science base will become “satu- 
rated”, says James Holste, former associate 
research dean at Texas A&M University in 
Qatar, the Doha branch of the institution based 
in College Station. At that point, investments 
will have drastically diminished returns. 

Already there are signs of strain. In the last 
NPRP cycle, Texas A&M in Qatar limited the 
number of proposals that faculty members 
could submit; its researchers qualify as Qatar- 
based, and demand for them from international 
institutions was so high that the school had to 
step in to regulate collaborations. “We were get- 
ting to the point where the labs were full and we 
had no place to put the people,” says Holste. 

Also laborious is the complex documenta- 
tion required by the NPRP to detail how grant 
money is spent. It can be difficult, for example, 
to secure extra funding for a project in progress, 
says Bernardine Dias, a robotics researcher at 
Carnegie Mellon University’s campuses in both 
Doha and Pittsburgh, Pennsylvania. Inflexibil- 
ity in a funding programme, says Holste, is “a 
kiss of death”. He advises the programme to 
mitigate excessive paperwork and documen- 
tation by affording its officers more discretion, 
according to the peculiarities of each project. 
One project might need more equipment; 
another might involve hiring postdocs to make 
up for the absence of PhD students (so far, no 
graduate programmes are offered in Qatar). 


CULTURE CLASH 

Machaca has experienced, first-hand, another 
challenge for scientists in Qatar: accommodat- 
ing research practices that may contrast with 
those in other nations. “Most of those differ- 
ences are culturally oriented,” says Machaca, 
noting, for example, the particular importance 
in Qatar of involving family members during 
the informed-consent process. 

Generally, where rules differ, researchers are 
expected to adopt the more stringent regula- 
tion. In such situations, the level of interna- 
tional collaboration may complicate matters. 

The Qatar Foundation has acknowledged the 
need to rethink how researchers report expenses 
and when extra funds should be allocated. The 
NPRP procedures can be strict, says Amer Al 
Saady, science adviser to the Qatar Foundation 
and a member of the QNRF’s steering committee.
“The QNRF is being cautious, or perhaps
over-cautious, in adopting this attitude,” he says,
emphasizing that the NPRP is only a few years
old. Complicating matters is a legacy of rare but
infamous cases of plagiarism and abuse of funds. 
Caution will prevail, says Al Saady, until a more 
robust research culture has been established. 

Already the QNRF is tweaking the proc- 
ess. “Compared with the first cycle, the QNRF 
has improved a lot,” says Dias, who in the first 
NPRP cycle received funding for two proposals 
of about US$750,000 each. “And what is important
is that it is taking feedback from the people
who are submitting proposals.” The foundation
has asked the outgoing dean of Carnegie Mel- 
lon’s business school in Qatar to suggest ways of 
simplifying expenditure reporting and accom- 
modation of unforeseen expenses. 

At KAUST and the UAE's New York Univer- 
sity Abu Dhabi Institute (NYUAD), initiatives 
to engage Western researchers generally entail 
fewer restrictions. Outside researchers who 
receive funding from KAUST are not required 
to partner with Saudi Arabia-based researchers; 
usually, the university just requires grant recipi- 
ents to participate in a couple of seminars or 
workshops in the country each year, to report 
on their research findings and progress. 

KAUST and NYUAD do have their own 
constraints. KAUST awards its grants to faculty 
members in specific institutions with which the 
university has signed collaboration agreements; 
and NYUAD bestows funds only on eligible fac- 
ulty members who work full-time at New York 
University or NYUAD. NPRP grants are open 
to any researcher, from academia or industry. 

Qatar’s sizeable financial resources, meanwhile,
allow the QNRF to fund a relatively high
proportion of proposals. In the most recent
cycle, for example, its success rate was 23% for
medical and health sciences, says Al-Taie.
By comparison, the US National Institutes
of Health accepted approximately 20.6%
of proposals. As the number of applicants
to the NPRP has increased, the
foundation has had to decide whether to
stay on budget and turn down more proposals,
or increase the budget and fund
further investigators. The latter choice won
out, says Al Saady.

Qatar may offer funding opportunities, but 
that does not guarantee top recruits. A paucity 
of graduate students adds to the difficulties 
of keeping a lab. Currently, scientists at Doha 
campuses get graduate students from the main 
campuses on short-term contracts, one or two 
semesters, or hire postdocs to do the work that 
PhD students would usually do. 

Postdocs, too, may face challenges. Some 
principal investigators can make it difficult to 
garner independence, says Rachel Jones, a
postdoctoral fellow in biomedicine at the 
WCMC-Q. “Others give a looser rein and 
let their postdocs pursue their own interests 
as well as their supervisors,” she says. 

Jones credits support from her postdoc 
supervisor for helping her to pursue the 
QNRF’s Young Research Scientist Expe- 
rience Program, launched in May. She 
received an award of US$100,000 a year 
for up to three years, which she views as a 
bridge to more substantial grants. 

Many researchers would like the benefits 
that come with tenure, an option yet to be 
offered to faculty members hired at Doha 
branches (as opposed to those visiting from 
the US home campuses). This is partly 
explained by Gulf countries’ labour and 
immigration laws, which frown on recruit- 
ment options implying a right of permanent 
residence. “Everything is set up so people 
who are not citizens are encouraged to leave 
after five or ten years,” says Holste.

And Qatar and other Gulf countries
rarely offer citizenship to expatriates. “A
broad extension of citizenship rights to non-nationals
would be extremely unpopular,”
says Diwan. Extensive state welfare pro- 
grammes in the Gulf, along with the delicate 
sectarian balance to be maintained between 
Sunni and Shi’a Muslims, render any pro- 
gramme to naturalize foreigners unpalatable 
to most. Foreigners wishing to be hired need 
a Qatari sponsor (who can be an individual, 
a firm or a government agency). And most 
contracts span two to five years. 

Officials at the branch campuses say that 
they are discussing options with the Qatar 
Foundation, and Al Saady notes that pro- 
posals to modernize labour laws are under 
way. Others in the region have proven faster 
than Qatar on the tenure front. The UAE, 
which has a thriving trade hub in Dubai, 
plenty of oil in Abu Dhabi and no sectarian 
divide, has more open labour and immigra- 
tion laws than most of its neighbours. 

Qatar’s litany of challenges has not dis- 
suaded enterprising researchers such as 
Machaca, who sees a long-term future in 
the small state. “As a scientist, what do I 
need? To do cutting-edge science, to pub- 
lish it, and hopefully in the long term to be 
able to commercialize it. And of course you 
want your family to be happy,” he says. “Can
I do this in Doha? Absolutely.”


Waleed Al-Shobakky is a freelance writer 
based in Doha. 


CORRECTION 

The story ‘A helping hand’ (Nature 468, 
721-723; 2010) inadvertently implied 
that Anuj Kapadia is a clinical radiologist. 
He is an assistant professor of radiology. 


DIVISION OF LABOUR 
Proportion of US doctorate recipients with definite post-graduation employment 
commitments in the United States, by field and sector. 
Drawn to academia


Low salaries and elusive tenure didn’t dim the appeal of 
self-directed research for US scientists last year. 


BY KAREN KAPLAN 


Despite a struggling economy, lower
salaries and an increase in adjunct and
contingent positions, a higher proportion
of US scientists headed to academia than
to any other sector in 2009, according to num- 
bers from the US National Science Foundation 
(NSF). Published on 3 December, Doctorate 
Recipients from US Universities: 2009 includes 
salary data for the first time in the annual 
report's 43-year history. 

Even with universities offering much lower 
salaries than industry, half of all life-sciences 
PhD recipients who had secured jobs said that 
they were entering academic positions, accord- 
ing to the survey. This proportion, which has 
varied little since 1989, is a testament to the pow- 
erful lure of positions that enable self-directed 
research, say analysts. “Many scientists want the 
independence of working on their own research, 
rather than on what’s handed to them,” says 
Mark Fiegener, an NSF programme manager 
based in Arlington, Virginia. The NSF received 
survey responses from 420 US universities and 
49,562 PhD recipients. 

The industrial sector proffered the high- 
est median early-career salaries — up to 
US$95,000 in some instances — in most dis- 
ciplines in the physical and life sciences. The 
median for an academic post in biological 
sciences was $45,000, compared with $85,000 
for a commercial position in the same subfield, 
including biochemistry, marine biology and 
zoology. Other fields had similar disparities. 


Academia dominated life-sciences employ- 
ment in 2009, but industry was stronger for 
physical scientists, despite changes to job num- 
bers since 2008 that run counter to five-year 
trends and could be due to pharmaceutical lay- 
offs (see ‘Division of labour’). Richard Freeman, 
an economist at Harvard University in Cam- 
bridge, Massachusetts, attributes the five-year 
trend in part to hiring increases at drug-making 
and chemical firms. He says that mergers and 
layoffs will continue to slow the field down. 

Industry’s constraints will put pressure on 
academia, which is already pinched by the 
recession, says Marc Bousquet, an associate 
professor at Santa Clara University in Cali- 
fornia who is on the executive council of the 
American Association of University Professors. 
Scientists in all fields will struggle to find aca- 
demic posts — and few of those available will 
be tenure-track, he says. 

The report also uncovers significant pay dif- 
ferences between early-career men and women 
with PhDs. Men earned up to $10,000 more than 
women in nearly all fields except astronomy, 
where they earned $30,000 more. Joan Herbers, 
president of the Association for Women in Sci- 
ence in Alexandria, Virginia, says women need 
help learning to negotiate salaries and raises. 
“When you start out at a lower salary,” she says,
“that dogs you for the rest of your career.”

Postdocs earned $37,500 to $45,000, which, 
given their average schedule, Freeman estimates, 
works out to $12.50 to $15 an hour. “Some of the 
best and brightest people in our country earn a 
pittance,” says Freeman.
RECURSION

Worlds within worlds.

BY SIMON QUELLEN FIELD


The little man opened the door and
stepped into Schmidt's office.
“Who let you in here?” asked the
surprised Schmidt.

“I just did,” the little man said, pointing
to the door.

“But that’s my bathroom,” Schmidt said,
rising from his chair.

“No matter,” said the little man. “In a
moment, you won’t care. Because I am about
to give you the most amazing thing you have
ever seen in your life.”

He held out his hand, on which there sat a
small blue sphere that seemed to shimmer.
Schmidt was about to protest when the little
man touched the sphere and pulled on it.
It grew as it followed his gesture, until it
was a large globe, the continents and
oceans easily recognizable, clouds
moving slowly across the surface.
Schmidt stopped and stared. It
was so lifelike. He could see
three-dimensional details in the landscape,
even birds and aeroplanes
as the view got closer.

“We call this the Simulation,” the
little man said. “It’s quite realistic. It
uses inputs from satellites, of course,
but also from all kinds of cameras
all over the world, cell phones, traffic
cameras, webcams, television. It’s quite
up-to-date. You can zoom in on anything
you like.”

He gestured again, and Schmidt
felt a dizzy sensation as the view
swooped down through clouds to
view a city, and then farther down
to view a street corner with busy traffic
and pedestrians, all moving and in perfect
3D. He could move his head and see behind
people and objects. He felt he could reach in
and touch things.

“How do —?” Schmidt began.

“It’s a simulation,” the little man said.
“There's data input, but most of it is generated.
Computers, you know.”

The view changed as the little man made
subtle movements with his hands. Schmidt
seemed to fly through walls, observing people
in their homes and at work, going about
their routines. A woman brushing her teeth
in front of a mirror. A couple arguing at a
table in a café. A seductive woman trolling a
bar in Paris. A fisherman struggling with a
line in Australia.

“It’s extremely popular where I come
from,” said the little man. “People fly all
around, spy on people, hang around women’s
locker rooms, it’s highly addictive. Hardly
anything else gets done. People stop talking
to each other, stop going to work, they’re just
fascinated.”

Schmidt himself was getting fascinated. It 
looked so real. He reached his hand out and 
the sphere responded, moving the scenes 
around as he gestured. He felt like he was fly- 
ing, swooping between buildings and under 
bridges, peering into windows, moving 


through solid walls like a ghost. He peeked 
into corporate boardrooms and spied on 
meetings in the Kremlin. 

“But that’s not all,” the little man said. “You
can go in.” He zoomed in on a doorway, until
the door was life-sized in front of them. “Any
door you like, you just open it and walk in.”

He reached for the doorknob, and turned 
it, pushing the door open. Schmidt looked 
in, and saw himself in a room that looked 
just like his office, standing next to a little 
man with a doorknob in his hand. He swung 
around and looked at the door to his bathroom, which was now
open, and he could see himself looking back.

“How —?” he started to ask.

“Cute trick, eh?” the little man said, clos- 
ing the door. “You can forget your corporate 
jet. Anywhere you want to go, you just open 
the door. That’s how I got here, of course.” 

“That can’t be real,” Schmidt said, shaking
his head. 

“No, it isn’t,” the little man replied. “Like 
I said, it’s a simulation. All done by com- 
puters. Collecting and organizing all the 
world’s information, and presenting it in a
nice three-dimensional user interface, with 
natural intuitive gestural inputs. Anyone can 
learn to use it in seconds, it needs no user 
manual.” 

“And you’re giving this to me?” Schmidt
asked, his gaze still held by the device, his 
hands still moving to direct the view. 

“Free of charge,” the little man said. “No
catch, it’s all yours.”

“I can see why people get addicted to
this,” Schmidt said. 
“Yes, that was a problem. Economy 
went into the crapper, people stopped 
having kids, food became scarce, things 
were really going downhill until we 
came up with this solution.”
“What solution was that?” Schmidt 
asked absently, his attention still riveted 
on the device in his hands. 

“A computer virus,” the little man said. 
“Ingenious, really. It’s called infinite recur- 
sion. Like putting two mirrors facing each 
other, so you get a hallway stretching on
forever. We put a Simulator inside the
Simulator, and the computers spend
all their time simulating more simulations,
until they don’t have any time to
do anything else. Everything grinds to a
halt after a little while. The toy isn’t fun anymore,
and people get back to their lives.”

“I’m not sure I understand,” Schmidt
said. 

“Give it a minute or two,” the little man
said. He gestured, and the view zoomed in 
on Schmidt's office, showing the two men 
gazing at the sphere. Inside the sphere, two 
copies of the men were staring at another 
sphere. “It will come to you,” he said. “Or 
maybe not.”


Simon Quellen Field is the chief executive 
of Kinetic MicroScience, where he designs 
scientific toys and writes books about 
science, as well as novels in science fiction, 
mystery and suspense. 


1. Field, S. Q. Nature 468, 1138 (2010). 


JACEY 




doi:10.1038/nature09593 


The assembly of a GTPase-kinase signalling complex by a bacterial catalytic scaffold

The fidelity and specificity of information flow within a cell is
controlled by scaffolding proteins that assemble and link enzymes
into signalling circuits. These circuits can be inhibited by bacterial
effector proteins that post-translationally modify individual
pathway components. However, there is emerging evidence that
pathogens directly organize higher-order signalling networks
through enzyme scaffolding, and the identity of the effectors
and their mechanisms of action are poorly understood. Here we
identify the enterohaemorrhagic Escherichia coli O157:H7 type III
effector EspG as a regulator of endomembrane trafficking using a
functional screen, and report ADP-ribosylation factor (ARF)
GTPases and p21-activated kinases (PAKs) as its relevant host
substrates. The 2.5 Å crystal structure of EspG in complex with
ARF6 shows how EspG blocks GTPase-activating-protein-assisted
GTP hydrolysis, revealing a potent mechanism of GTPase signalling
inhibition at organelle membranes. In addition, the 2.8 Å
crystal structure of EspG in complex with the autoinhibitory Iα3-helix
of PAK2 defines a previously unknown catalytic site in EspG
and provides an allosteric mechanism of kinase activation by a
bacterial effector. Unexpectedly, ARF and PAKs are organized
on adjacent surfaces of EspG, indicating its role as a ‘catalytic
scaffold’ that effectively reprograms cellular events through the
functional assembly of a GTPase-kinase signalling complex.

To identify new signalling pathways targeted by bacterial pathogens, 
we used a human growth hormone (hGH) secretion assay to measure
the ability of type III and type IV effector proteins to regulate vesicle
trafficking through the general secretory pathway (Fig. 1a, b).
Consecutively, each bacterial effector was tagged with enhanced green 
fluorescent protein (eGFP) and assessed for localization at host orga- 
nelles (Fig. 1b). We noted that several type III effectors encoded by the 
extracellular pathogen enterohaemorrhagic E. coli (EHEC) O157:H7 
inhibited host trafficking events, whereas effectors secreted by 



Figure 1 | EspG inhibits endomembrane trafficking and disrupts Golgi 
architecture. a, hGH trafficking assay showing how the hGH-FKBP*
(Phe 36 Met mutant) aggregates in the endoplasmic reticulum until drug
application (AP21998), whereby hGH enters the general secretory pathway 
and is secreted into the culture medium. b, hGH release assay showing the 
effects of type III and type IV effector proteins on trafficking through the 
general secretory pathway (Methods). hGH was quantified by enzyme-linked 
immunosorbent assay and normalized to GFP control (Drug) experiments. 
The subcellular localization of eGFP-tagged effectors is indicated.
ER, endoplasmic reticulum. c, Co-localization of eGFP-EspG (green) with
cis-Golgi matrix protein GM130 (red). The Golgi in untransfected cells 
appears as tightly associated cisternae. d, Golgi and microtubule phenotypes 
induced by EspG protein microinjection (asterisk). The percentage of 
microinjected cells exhibiting each phenotype is indicated (n = 3, from >40 
cells per experiment). e, f, ARF GTPase (e) and PAK isoforms (f) that interact 
with EspG by yeast two-hybrid. g, h, Glutathione pull-down of GST-ARF 
isoforms (g) and GST-PAK1 fragments (h) with recombinant MalE-tagged 
EspG.

Salmonella Typhimurium, Legionella pneumophila and Bartonella
henselae showed little inhibitory function, consistent with their
intracellular life cycles (Fig. 1b). In particular, the EHEC type III effector
EspG blocked exocytosis of hGH through an unknown molecular
mechanism (Fig. 1b). eGFP-tagged EspG localized to the cis-Golgi
apparatus, where it induced severe fragmentation of the organelle
(Fig. 1c, d and Supplementary Fig. 1a, b). The Golgi disruption phenotype
was observed when 10 nM recombinant EspG protein was microinjected
into cells to mimic the protein concentration delivered by
E. coli through the type III secretion apparatus (Fig. 1d). In addition,
EspG disrupted the recycling endosome compartment in both transfection
(Supplementary Fig. 1c, d) and microinjection experiments
(data not shown). Previous genetic studies have implicated EspG
and related Shigella family members in microtubule depolymerization.
However, microtubules were intact in EspG microinjected cells
(Fig. 1d), consistent with previous reports showing that these effectors
do not disrupt cytoskeletal architectures. Thus, EspG represents a
new class of bacterial signalling effector that functionally regulates
cargo trafficking from membrane organelles.

We used a yeast two-hybrid screen to identify host enzymes targeted
by EspG. The screen resulted in 26 positive interactions with multiple
overlapping complementary DNA clones expressing two ARF GTPase
family isoforms (ARF1 and ARF6) and three p21-activated kinase
family members (PAK1, PAK2 and PAK3) (Fig. 1e, f). ARF GTPases
function within a broad range of organelle systems, where they organize
vesicle transport machinery, phospholipids and signalling molecules
at membrane microdomains, whereas the PAK family of
serine/threonine kinases transduce Cdc42 and Rac1 GTPase signals
that establish intracellular polarity. Direct interactions between EspG
and the GTPase domain of ARF family members (Fig. 1g) and the
autoinhibitory domain (AID) of PAK kinases (Fig. 1h) were shown by
Figure 2 | The structure of EspG in complex with GTP-bound ARF6. a, The
overall structure of the EspG-ARF6(GTP) complex. EspG is shown in cyan and ARF6
in green. Switch I and switch II on ARF6 are coloured orange and red,
respectively. b, EspG selectively binds the GTP-loaded ARF1 and ARF6 (GST
tagged) in glutathione pull-down assays. The native lane represents ARF
GTPases purified from bacteria without removing or loading specific
nucleotides. c, Structural overlay of EspG-ARF6(GTP) and ASAP3(GAP)-
ARF6(GDP·AlFx) (Protein Data Bank ID, 3LVQ) showing how EspG sterically
hinders ARF binding to ASAP3-GAP. The catalytic Arg finger of ASAP3 is
labelled. d, GTP hydrolysis assay showing that EspG inhibits GAP-assisted
in vitro binding studies using purified recombinant proteins. These
findings establish two EspG substrates that are consistent with its
regulatory function in host protein trafficking identified here and in
bacterial infection studies conducted in vivo.

Next we crystallized EspG (residues 42-398) in complex with the
GTPase domain of ARF6 (residues 13-175) and solved the structure to
a resolution of 2.5 Å (Supplementary Table 1). EspG buries 602 Å² of
ARF6 surface area and the complex interface is mediated by a collaboration
of EspG loops (loops connecting β5 and β6, β8 and α6, and
β12 and β13) that specifically engage the switch I loop of ARF6 and
several residues lining the guanine-nucleotide-binding pocket (Fig. 2a
and Supplementary Fig. 2a, b). The conformational state and amino-acid
sequence of switch I are highly conserved between ARF family
members, indicating that the EspG-ARF6 structure illustrates the nature
of EspG’s interaction with several ARF isoforms (Supplementary
Fig. 3a, b). The importance of conserved switch I residues for binding
EspG was confirmed by mutational analysis on ARF6 (Supplementary
Fig. 3c).

ARF6 is GTP bound in the crystal and adopts an active-state conformation
nearly identical to that of ARF6(GTPγS) (ref. 21;
Supplementary Fig. 4a). Further structural analyses revealed that
switch I is inaccessible to EspG when ARF6 adopts the GDP-bound
conformation (Supplementary Fig. 4a). EspG selectively bound the
GTP-loaded forms of ARF1 and ARF6 but did not recognize GDP-ARF
complexes (Fig. 2b). Moreover, EspG interacted with ARF6(GTP) in
its full-length myristoylated form, which was isolated from membrane
fractions (Supplementary Fig. 4b). Thus, EspG preferentially targets
the active ARF(GTP) signalling molecule.

COPI coat, vesicle complex adaptors and signalling enzymes primarily
associate with switch 2 and the β2/3 interswitch of ARF(GTP) (refs
22-26; Supplementary Fig. 5a). Given the frequent occurrence of this
GTP hydrolysis on ARF1. The rate of γ³²P[GTP] hydrolysis was measured as
the percentage of γ³²P[GTP] remaining on ARF over time. Intrinsic ARF1
GTPase activity (control, green), GAP-stimulated activity (GAP, blue triangle),
and EspG inhibition of GAP activity (EspG + GAP, open diamond) or mutant
EspG Glu 392 Arg (open circle) are shown. e, Time course of the Golgi
disruption phenotype presented as the percentage of microinjected cells with
altered Golgi morphology as shown in Fig. 1d. At least 45 microinjected cells
were scored in each trial for a Golgi disruption phenotype, and the data are
representative of three experimental trials. BFA, brefeldin A.
binding mode, we were surprised to find that EspG is rotated away
from these common binding elements and is positioned directly over
the guanine-nucleotide-binding pocket (Fig. 2a, c). Surprisingly, however,
EspG does not function as a guanine nucleotide exchange factor
(Supplementary Fig. 5b, c) or a GTPase-activating protein (GAP)
(Fig. 2d, cyan circles). Rather, EspG is appropriately positioned to
hinder binding of ARF-GAP and its catalytic access to the γ-phosphate
of GTP (Fig. 2c). EspG completely abolished the GAP-stimulated
GTP hydrolysis on lipid-anchored ARF1 (Fig. 2d, diamonds),
in comparison with the fast ARF-GAP reaction (Fig. 2d, blue triangles).
The inhibition of GAP by EspG relied on a direct interaction
between EspG and ARF because the binding-deficient mutant EspG
Glu 392 Arg (characterized in Supplementary Fig. 6) had no effect on
GAP-stimulated hydrolysis (Fig. 2d, open circles).

GTP hydrolysis and exchange on ARF are required for proper membrane
transport functions, suggesting that EspG inhibits Golgi trafficking
by blocking the ARF guanine nucleotide cycle. Several lines of
evidence support this idea. First, EspG disrupted the Golgi complex
with rapid inhibitory kinetics (Fig. 2e) and a phenotype similar to the
fungal toxin brefeldin A (Supplementary Fig. 7a), a potent ARF1
GTPase inhibitor that also interferes with the guanine nucleotide
cycle. Second, microinjection of dominant-negative ARF protein
(ARFΔN13) caused a significant delay in Golgi disruption induced
by EspG (Fig. 2e, open circles). Third, EspG Glu 392 Arg, a mutant
that does not interact with host substrates (Supplementary Fig. 6), had
no effect on Golgi morphology or trafficking function (Fig. 2e and


Figure 3 | The structure of EspG in complex with PAK2 Iα3 peptide. a, The
overall EspG-PAK2 complex with EspG oriented and coloured as in Fig. 2a.
The PAK2 Iα3 peptide (residues 123-134) is shown in magenta. b, Detailed
interactions between EspG and PAK2. Key binding residues from EspG (blue
labels) and PAK2 (black labels) are shown. c, Close-up view of autoinhibited
PAK1 homodimer (Protein Data Bank ID, 1F3M) focused on chain B (kinase
domain, blue) and chain D (autoinhibitory domain, yellow). The Iα3-helix
inhibitory functions are labelled (i)-(iii) corresponding with those outlined in
the results section. The Iα3-helix extracted from the PAK1 structure
(numbering corresponds to PAK2 for ease of comparison) is shown at the
Supplementary Fig. 7b, c). Finally, EspG co-localized with the ARF1
effector β-COP (ref. 26) on Golgi membranes (Supplementary Fig. 1b).
These combined structural and cellular studies provide a mechanism
for bacterial regulation of membrane trafficking: EspG prevents vesicle
transport by directly inhibiting ARF guanine nucleotide turnover on
host membranes.

Having established the mechanism of ARF GTPase regulation, we
next explored a second possible function of EspG: regulation of PAK
family kinases. The EspG-binding site on PAK2 was mapped to residues
121-136, a highly conserved sequence that encodes the Iα3-helix
within the kinase AID (Supplementary Fig. 8). We crystallized EspG in
complex with the PAK2 Iα3-helix fragment and solved the structure to
a resolution of 2.8 Å (Supplementary Table 1). EspG recognized the
initial turn of the Iα3-helix whereas the remainder of the peptide
adopted an extended strand conformation that lies orthogonal to the
EspG six-stranded β-sheet (Fig. 3a, b). EspG buries 684 Å² of the PAK2
surface area and the binding is primarily supported by a large hydrophobic
interface and hydrogen bonding by residues Asn 212 and
Asn 323 of EspG (Fig. 3b). This structural interface was confirmed
by a series of in vitro binding studies and kinase assays using PAK2
and EspG mutant proteins (Supplementary Fig. 9).

To determine how EspG may regulate the kinase, we compared the
peptide structure from EspG-PAK2 with the structure of the Iα3-helix in
the autoinhibited PAK1 homodimer (Fig. 3c). In autoinhibited
PAKs, the Iα3-helix is sandwiched between the kinase domain and
the AID, where it has three autoinhibitory functions: (i) it folds onto
upper right. The corresponding PAK2 Iα3-helix extracted from the EspG
structure (lower right) is oriented by the amino-terminal helical residues 123-127.
KI loop, kinase inhibitory loop. CRIB, Cdc42/Rac1 interacting binding.
d, e, PAK2 kinase assays comparing 2 µM EspG with equimolar GTPγS-loaded
Cdc42 (d) and the indicated EHEC type III effectors (e). Phosphorylation of
myelin basic protein (MBP) substrate, input levels of PAK2 and quantification
of each experiment are shown. f, PAK2 kinase assays comparing autoinhibited
wild-type (WT) PAK2 with PAK2 mutants Val 123 Asp and Phe 129 Asp. Data
are presented as in d.
the EF- and G-helices to block substrate binding; (ii) it positions the
‘kinase inhibitory’ loop across the enzyme catalytic cleft; and (iii) it
stabilizes the AID three-helix bundle maintaining PAK homodimerization
(Fig. 3c). Hence, EspG binding to the Iα3-helix suggested a
mechanism for PAK activation. EspG induced a (7.6 ± 2.5)-fold
(n = 11) increase in PAK2 activity, a profile that is comparable to
PAK stimulation by GTPγS-loaded Cdc42 ((7.8 ± 3.4)-fold, n = 3)
(Fig. 3d). Notably, EHEC type III effectors Map and EspFU showed
no PAK stimulatory activity, demonstrating the signalling specificity
of EspG (Fig. 3e).
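
As a rough illustration of how fold-activation values such as those quoted above can be summarized, the following Python sketch divides stimulated by basal kinase counts for each experiment and reports the mean and sample standard deviation. The counts are hypothetical, the function names are invented for this example, and treating the ± values as standard deviations is an assumption, since the text does not say which statistic was used.

```python
from statistics import mean, stdev


def fold_activation(stimulated_cpm, basal_cpm):
    """Per-experiment fold change of stimulated over basal kinase activity."""
    if len(stimulated_cpm) != len(basal_cpm):
        raise ValueError("need one basal reading per stimulated reading")
    return [s / b for s, b in zip(stimulated_cpm, basal_cpm)]


def summarize(folds):
    """Mean and sample standard deviation (assumed meaning of 'mean ± value')."""
    return mean(folds), stdev(folds)


# Hypothetical 32P counts per minute, for illustration only (not data from the paper).
basal = [1000.0, 1200.0, 800.0]
with_activator = [8000.0, 7200.0, 6400.0]

folds = fold_activation(with_activator, basal)
avg, sd = summarize(folds)
print(f"{avg:.1f} ± {sd:.1f}-fold (n = {len(folds)})")
```

Pairing each stimulated reading with the basal reading from the same experiment keeps day-to-day variation in labelling efficiency out of the ratio.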

To gain further insight into the mechanism of kinase activation, we
first examined the details of the EspG-PAK2 interface. EspG residue
Asn 212 probably initiates the kinase reaction because this residue
engages the surface-accessible residues Asp 131 and Ser 132 in the
autoinhibited PAK homodimer (Fig. 3b and Supplementary Fig. 10).
On initial recognition, EspG displaces the Iα3-helix by reorganizing its
secondary structure and by displacing side chains that normally contact
the autoinhibitory interface between the kinase domain and the
AID (Fig. 3c). These data suggest a novel allosteric mechanism for
PAK activation. To further confirm that PAK is stimulated by local
perturbations in the environment surrounding Iα3, we mutated the
hydrophobic PAK2 residues Val 123 and Phe 129 that stabilize the
AID and the kinase domain, respectively. Both Val 123 Asp and
Phe 129 Asp resulted in a constitutively active kinase with more than
60-fold enhancement of substrate phosphorylation (Fig. 3f). We note
that the mechanism of EspG binding to PAK is structurally distinct
from that of Cdc42 binding, indicating that the catalytic machinery
of EspG is a unique bacterial invention.

The two EspG structures reported here are nearly identical, with a
root mean squared deviation of 0.612 Å over 349 Cα atoms. As shown
in Fig. 4a, ARF6 and PAK2 occupy distinct, non-overlapping binding
sites on adjacent surfaces of EspG. Consistent with this view, EspG
nucleates a trimeric complex between the kinase and GTPase in solution
(Supplementary Fig. 11). This complex could also be reconstituted
on Golgi mimetic liposomes (Fig. 4b). ARF1(GTP) recruited EspG to the
Figure 4 | EspG functions as a catalytic scaffold at membrane organelles.
a, Structural overlay of EspG-ARF6(GTP) and EspG-PAK2 highlighting the
close association between ARF and PAK on the surface of EspG. Colours are as
in Figs 2a and 3a except that EspG from the PAK2 structure is coloured purple.
b, Golgi-mimetic-liposome-binding assays showing that EspG nucleates a
trimeric complex between ARF1 and PAK2 on membrane surfaces. After
centrifugation, proteins remaining in the supernatant (S) or those associated
with liposomes in the pellet (P) are indicated. c, HEK293A cells co-transfected
artificial membrane surface (Fig. 4b, lane 4), which in turn localized
PAK2 to these sites (Fig. 4b, lane 6). Notably, PAK2 localization was
strictly dependent on formation of the EspG-ARF1(GTP) complex
(Fig. 4b, lanes 6-8) and ARF1 tethering to the membrane (Fig. 4b,
lanes 9 and 10). As predicted by these findings, EspG co-localized with
ARF1 at the Golgi (Fig. 4c). We further speculated that PAK would also
be recruited to these sites. To test this hypothesis, an in vivo ‘activity’
probe was engineered by fusing the PAK2 Iα3-helix sequence (residues
121-136) to the carboxy terminus of the mCherry fluorophore. The
PAK2 probe recognized cellular EspG and was targeted to the Golgi
complex in 78 ± 5% of EspG-transfected cells (Fig. 4c). By comparison,
mutant EspG Asn 212 Ala that lacked all kinase stimulatory activity
(Supplementary Fig. 9) localized to the Golgi complex but did not
recruit PAK2 to these sites (Fig. 4c). Together, our studies support the
function of EspG as a catalytic scaffold that links GTPase inhibition
with kinase signal transduction pathways at membrane organelles
(Fig. 4d).
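
The root mean squared deviation used above to compare the two EspG structures (0.612 Å over 349 Cα atoms) can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the two sets of Cα coordinates are already paired atom by atom and already superposed, and it does not show the superposition step or any structure-file parsing. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equal-length lists of (x, y, z).

    Assumes the coordinates are already superposed and paired atom by atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in ångströms, for illustration only (not taken from either structure).
model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
print(rmsd(model_1, model_2))
```

Because each squared distance is averaged before the square root is taken, a few badly matched atoms raise the value more than many small deviations do.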

EspG belongs to a large family of type III effectors secreted by diverse
bacterial pathogens. Our studies show that EspG has structural
homology with VirA (refs 14, 15) from Shigella flexneri, suggesting
that it too may function as an enzyme scaffold (root mean squared
deviation, 3.1 Å; Z-score, 5.9) (Supplementary Fig. 12). However, a
detailed structural comparison indicates that VirA is unlikely to tar- 
get the same signalling pathways as EspG during Shigella pathogen- 
esis (Supplementary Fig. 12). We provide mechanistic insights and 
structural evidence that EspG harbours two unique pathogenic activ- 
ities, ARF GTPase inhibition and PAK stimulation. Moreover, EspG 
targets PAK to specific membrane surfaces through its association 
with ARFs. From a strategic point of view, the assembly of an artificial 
enzyme complex enables bacteria to precisely control signalling 
events with little competition between endogenous regulatory sys- 
tems. Thus, it is intriguing to speculate that EspG organizes a higher- 
order signalling network to effectively subvert key cellular processes 
including cell polarity, adhesion, receptor trafficking and protein 
secretion. 
with the indicated constructs showing that eGFP-EspG co-localizes with
ARF1-mCherry and recruits a PAK activity probe (mCherry-PAK2(121-136)) to
Golgi membranes. The percentage of cells exhibiting co-localized EspG with 
mCherry-tagged proteins (n = 3) is shown in the upper right of the merged 
micrographs. d, Model of the dual function of EspG as an inhibitor of 
membrane trafficking and as a catalytic scaffold that assembles a GTPase- 
kinase signalling complex at cellular membranes. GEF, guanine nucleotide 
exchange factor. 
METHODS SUMMARY


Recombinant protein preparation and cloning were done using standard methods. 
In immunofluorescence experiments, we transfected cells with Fugene 200 or 
microinjected them with 10 nM protein, where indicated. We performed kinase 
assays with rabbit pEGFP-PAK2, which was immunoprecipitated from 293T cell
lysates and incubated with the protein of interest in the presence of MBP, 10 µM
ATP and 5 µCi γ³²P[ATP]. Reactions were stopped by the addition of SDS buffer,
separated by SDS-polyacrylamide gel electrophoresis and kinase activity was
measured as ³²P counts per minute. EspG-ARF6 and EspG-PAK2-Iα3 were
purified by ion exchange and gel filtration chromatography and crystallized by 
the hanging-drop vapour diffusion method. We collected X-ray diffraction data at 
the Structural Biology Center, Advanced Photon Source, Argonne National 
Laboratory (USA). The structure of EspG-ARF6 was phased to a resolution of
2.5 Å by the multiwavelength anomalous dispersion method using
selenomethionine-labelled EspG and ARF6 proteins. The EspG-PAK2 structure was solved to a
resolution of 2.8 Å by the molecular replacement method using the EspG monomer
of the EspG-ARF6 structure as the initial search model. Further details can be
found in Supplementary Information. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

Plasmids. The espG gene from EHEC O157:H7 was PCR cloned in-frame into 
pEGFP-C2 (Clontech) and pcDNA3.1-mCherry. For bacterial expression, 38 and 
41 amino-acid N-terminal deletions (39-398 and 42-398) of EspG were PCR 
subcloned into pGEX-4T1 (GST-tag) (Amersham), pProEX-HTb (6xHis tag) 
(Novagen) and pET28b-MalE (6xHis tag, MalE-tag) vectors. EspG mutants were 
generated with QuikChange site-directed mutagenesis (Stratagene) following
the manufacturer’s instructions. N-terminal deletions of ARF GTPases (ARF1Δ17,
ARF5Δ17 and ARF6Δ13) and ARL proteins (ARL1Δ17 and ARL2Δ16) were PCR
subcloned into pGEX-4T1 and pProEX-HTb vectors. Human PAK1 construct was
obtained from Dr Gary Bokoch (TSRI, La Jolla, California), and rabbit PAK2 and
PAK3 were obtained from Dr Melanie Cobb (UTSW). PCR cloning was used to
generate variable-length constructs of PAK isoforms in pGEX-4T1 vector. All 
constructs were verified by DNA sequencing. 

Yeast two-hybrid system. The yeast expression vector pLexA encoded a gene with 
an NH2-terminal LexA-binding domain and residues 1-398 of EHEC EspG. Day 
9.5 and 10.5 mouse embryo cDNA library (250 µg) in VP16 were screened using
the yeast two-hybrid system. 

Protein purification for in vitro assays. Recombinant proteins were produced in 
BL21-DE3 E. coli strains. Protein expression was induced with 0.4 mM IPTG for
16 h at 18 °C. Bacterial pellets were lysed in either His buffer (100 mM HEPES,
pH 7.5, 300 mM NaCl) or GST buffer (TBS; 50 mM Tris pH 7.5, 150 mM NaCl,
2 mM DTT) supplemented with protease cocktail (Roche). Proteins were purified
with nickel agarose (Qiagen) or glutathione Sepharose (Amersham Biosciences)
following the manufacturer’s instructions. Eluted proteins were buffer exchanged into
TBS using concentration centrifugal columns (Millipore), glycerol was added to
15% and the proteins were then stored at −80 °C.

In vitro GST pull-downs. Protein interactions were examined through GST
pull-down assays. Unless otherwise stated, 15 µg of recombinant GST proteins
immobilized to glutathione Sepharose were incubated with 20 µg of 6xHis- and/or
MalE-tagged proteins for 1 h at 4 °C. Samples were washed three times in TBS
supplemented with 0.5% Triton X-100. Proteins were eluted from beads with
Laemmli sample buffer and were separated by SDS-polyacrylamide gel
electrophoresis and stained with Coomassie blue. For nucleotide loading, ARF1Δ17 and
ARF6Δ13 were incubated in nucleotide loading buffer (40 mM HEPES, 150 mM
NaCl, 2 mM EDTA, 10% glycerol) with 10 µM of either GDP or GTP for 30 min at
37 °C, and then MgCl2 was added to 10 mM and the reaction was transferred to ice
after 15 min at room temperature (25 °C).

Kinase assays. To obtain full-length PAK2 kinase, 10-cm dishes with 293A cells
were transfected with 5 µg rabbit PAK2 cDNA in pEGFP-C2 vector and expressed
for 48 h post-transfection. Cells were broken in lysis buffer (20 mM Tris, pH 7.5,
150 mM NaCl, 5 mM EDTA, 0.5% Triton X-100). PAK2 kinase was purified by
immunoprecipitation with 1:1,000 polyclonal anti-GFP antibody (Clontech) and
25 µl protein A/G slurry for 1 h at 4 °C. Beads were washed twice with lysis buffer
and twice with kinase buffer (40 mM HEPES, pH 7.5, 10 mM MgCl2). MBP (5 µg)
and 2 µM activating proteins (that is, EspG or Cdc42) were added to the beads for a
total volume of 30 µl. The reaction was equilibrated on ice for 30 min. The kinase
activity was initiated by an addition of 10 mM ATP and 5 µCi γ³²P[ATP]. After
5 min at room temperature, the reaction was stopped with 30 µl 2× Laemmli
sample buffer. Contents were separated by SDS-polyacrylamide gel electrophoresis,
transferred to nitrocellulose membrane and either analysed by western blot
(1:5,000 monoclonal anti-GFP) or exposed by autoradiography. Bands were cut
out and the radioactivity signal measured on a scintillation counter.

Cell microinjection, transfections and immunofluorescence microscopy. 
Normal rat kidney cells were microinjected with EspG proteins using a
semi-automatic InjectMan NI2 micromanipulator (Eppendorf). A needle concentration
of 10 nM was calculated to inject between 5,000 and 20,000 copies because we
microinjected ~5% cell volume, giving a final estimated cellular concentration of
50 pM in a cell volume of 5,000 µm³. HeLa and HEK293A cells were transfected
using calcium phosphate. At 16-18 h post-transfection, cells were fixed with 3.7%
formaldehyde and stained with antibodies for immunofluorescence. In
co-transfection experiments, equal amounts of DNA were used for each sample. Brefeldin
A treatment was performed by adding 5 µg ml⁻¹ of brefeldin A to the medium
before fixation with formaldehyde. As a negative control, ethanol was added to the
medium. All immunofluorescence images were acquired with a Zeiss LSM 5 Pascal
confocal microscope. Golgi, endosomes and microtubules were detected using
anti-GM130 (Transduction Labs), anti-EEA1 (Transduction Labs) and anti-α-tubulin
(Sigma) antibodies, respectively.

hGH trafficking assay. hGH trafficking assays were performed as described
previously. Briefly, HeLa cells (50% confluence) were transfected with 1 µg of
4xFKBP-hGH (Ariad Pharmaceutical, Inc.; http://www.ariad.com/regulationkits;
source of material, David Bernstein) and either 0.5 µg eGFP-EspG or pEGFP
control plasmid with Fugene6 (Roche). Sixteen hours later, the medium was
replaced with medium containing AP21998 (final concentration, 2 µM) or vehicle
control. AP21998 was incubated with the cells for 2 h before the supernatant was
collected. The supernatant was then diluted 100-fold and compared against an hGH
standard curve (12.5-400 pg ml⁻¹) for the quantification of hGH released using an
hGH enzyme-linked immunosorbent assay (Roche). For no drug controls, 100%
ethanol (2 µl) was incubated with the cells for 2 h.
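
The quantification step described above (reading a diluted supernatant against a standard curve, correcting for the 100-fold dilution and normalizing to a control) can be sketched as follows. This is an illustrative outline, not the authors' analysis: the absorbance values, the twofold spacing of the standards between 12.5 and 400 pg ml⁻¹, the use of linear interpolation and all function names are assumptions made for the example.

```python
# Hypothetical standards as (concentration in pg/ml, absorbance); only the
# 12.5-400 pg/ml range comes from the text, the rest is assumed.
STANDARDS = [(12.5, 0.05), (25.0, 0.10), (50.0, 0.20),
             (100.0, 0.40), (200.0, 0.80), (400.0, 1.60)]


def interpolate_concentration(absorbance, standards=STANDARDS):
    """Linearly interpolate a concentration (pg/ml) from the standard curve."""
    pts = sorted(standards, key=lambda s: s[1])
    if not pts[0][1] <= absorbance <= pts[-1][1]:
        raise ValueError("reading falls outside the standard curve")
    for (c_lo, a_lo), (c_hi, a_hi) in zip(pts, pts[1:]):
        if a_lo <= absorbance <= a_hi:
            frac = (absorbance - a_lo) / (a_hi - a_lo)
            return c_lo + frac * (c_hi - c_lo)
    raise ValueError("no bracketing standards found")


def released_hgh(absorbance, standards=STANDARDS, dilution_factor=100):
    """Concentration in the undiluted supernatant (pg/ml)."""
    return interpolate_concentration(absorbance, standards) * dilution_factor


def percent_of_control(sample, control):
    """Express a sample as a percentage of the control value."""
    return 100.0 * sample / control


sample = released_hgh(0.30)
print(sample, percent_of_control(sample, 15000.0))
```

Readings outside the range of the standards raise an error rather than being extrapolated, which mirrors the usual practice of re-diluting such samples.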

Liposome pull-downs and GAP assays. Liposome preparation: Lipids were purchased
in powder form from Avanti Polar Lipids. Golgi mimetic liposomes were
created by combining 20 mol% DOGS-NTA with DOPC, DOPE, DOPA, DOPS,
PI, PI4P and PI(4,5)P2 in the molar ratios reported previously. Total lipid (5 mM)
was solubilized in chloroform, dried under an anhydrous nitrogen stream and
further dried in a vacuum desiccator for approximately 5 h. Dried lipids were
hydrated with liposome-binding buffer (20 mM Tris-HCl, pH 7.6, 50 mM NaCl,
10 mM MgCl2) and vigorously vortexed between five freeze-thaw cycles (liquid
nitrogen and 80 °C to ensure appropriate phase transition and dispersion of the
various lipids), after which liposomes were generated by means of ultrasonication
in a bath sonicator (Laboratory Supplies Company). Liposomes were collected
from the supernatant after centrifugation (2,500g for 5 min) and used in
subsequent assays.

Liposome pull-down assays: Liposomes were prepared as described above.
ARF1 GTP loading was carried out by incubating purified 6xHis-tagged
ARF1ΔN17 in nucleotide exchange buffer (20 mM Tris-HCl, pH 7.6, 50 mM
NaCl, 5 mM EDTA, 10% glycerol, 1 mM DTT) with 100 µM GTP. After incubation
at 37 °C for 30 min, 10 mM MgCl2 was added to stabilize ARF1(GTP). The
requisite volume of ARF1(GTP), GST PAK (residues 121-136) or EspG was added
to bring the protein concentration to 3 µM in a 100-µl volume. Liposomes were
added for a final lipid concentration of 10 µM and reactions proceeded at room
temperature for 30 min. Samples were subjected to centrifugation at 100,000g in a
Beckman TLA100.3 rotor and a Beckman TL100 ultracentrifuge at 4 °C for 1 h.
Supernatant and pellet were separated and analysed on a 12.5% polyacrylamide gel
and visualized with Coomassie blue.

GTP hydrolysis assays: ARF1 was incubated in nucleotide exchange buffer with
250 nM γ³²P[GTP] (MP Biomedical) for 30 min at 37 °C, after which 10 mM
MgCl2 was added to stabilize ARF1(γ³²P[GTP]). ARF1(γ³²P[GTP]) was incubated
with 10 µM Golgi mimetic liposomes 5 min before the addition of fivefold molar
excess rat ARFGAP1 (ref. 30), EspG and EspG(E392R). In the case of hydrolytic
protection assays, EspG or EspG(E392R) was added 5 min before the addition of
rat ARFGAP1. Aliquots (5 µl) of the 50-µl reaction were removed at the times
indicated, added to 5 ml ice-cold binding buffer (TBS + 10 mM MgCl2) and
vacuum-filtered through nitrocellulose membranes. Membranes were washed three times
with ice-cold binding buffer and subjected to scintillation counting. Data analysis
was carried out in GRAPHPAD PRISM 5.0b.

Crystallization and structure determination. Protein expression and purification:
A stable protein fragment of EspG residues 42-398 was identified by limited
proteolysis and mass spectrometry. cDNA encoding EHEC O157:H7 EspG residues
42-398 and human ARF6 residues 14-175 were synthesized by PCR and
ligated into the pPRO-EX-HTb expression vector. The resulting plasmids were
then transformed into the E. coli strain BL21-DE3. Protein expression was induced
by 1 mM IPTG overnight at 16 °C and proteins were purified on Ni-NTA agarose,
concentrated to 10 mg ml⁻¹ in TBS buffer with 5% glycerol, snap-frozen in liquid
nitrogen and stored at −80 °C. The Se-Met variants of EspG and ARF6 were
expressed in methionine auxotrophic E. coli strain B834-DE3 and grown in minimal
medium supplemented with natural amino acids and Se-Met. Expression and
purification were unchanged. EspG-ARF6 complex was formed overnight at
room temperature in the presence of 1:100 TEV protease to cleave the 6xHis
tag. The complex was purified by successive anion exchange (Q-HP) and gel
filtration (Superdex 200 GL) chromatography and concentrated to 7 mg ml⁻¹ in
25 mM Tris, pH 7.5, and 50 mM NaCl. For the PAK2 crystal trials, EspG protein
was expressed and purified identically to that described. However, after the anion
exchange, EspG was incubated with a fivefold molar excess of PAK2 peptide
(residues 121-136). The complex was purified by gel filtration as described.

Crystallization and X-ray diffraction data collection: Crystals of EspG-ARF6
were grown using the hanging-drop vapour diffusion method from drops containing
2 µl protein (7 mg ml⁻¹) and 1 µl of reservoir solution (0.1 M sodium
acetate, pH 5.0, 2% PEG4000, 5% 2,3-methylpentanediol (MPD)), and equilibrated
over 500 µl of reservoir solution. Bipyramid-like crystals appeared after
1 d at 20 °C and grew to their maximal extent in 2-3 d. Crystals were relatively
large in all three dimensions (0.3 × 0.6 × 0.3 mm³). Cryo-protection was performed
by transferring the crystals to a final solution of 37% MPD, 0.1 M sodium
acetate, pH 5.0, and 2% PEG4000, increasing in 5% steps of MPD over the course
of 10 min at 20 °C. Crystals were flash-frozen using liquid nitrogen. EspG-ARF6
crystals had the symmetry of space group P4₃22 with unit-cell parameters of
a = b = 104.6 Å and c = 98.3 Å, and contained one molecule each of EspG and
ARF6 per asymmetric unit. EspG-ARF6 crystals diffracted isotropically to a d_min
of 2.50 Å when exposed to synchrotron radiation.

Crystals of EspG-PAK2 were grown using the hanging-drop vapour diffusion
method from drops containing 1 µl protein (12 mg ml⁻¹) and 1 µl of reservoir
solution (0.1 M Tris, pH 8.0, 0.25 M sodium chloride and 20% PEG4000) and
equilibrated over 500 µl of reservoir solution. Plate-like crystals appeared after
2 d at 20 °C and grew to their maximal extent by 4-5 d. Crystals were large in
two dimensions (0.2 × 0.5 mm²) and relatively thin (0.1 mm). Cryo-protection
was performed by transferring the crystals to a final solution of 15% ethylene
glycol, 22% PEG4000, 0.1 M Tris, pH 8.0, and 0.25 M sodium chloride, increasing
in 5% steps of ethylene glycol over the course of 10 min at 20 °C. Crystals were
flash-frozen using liquid nitrogen. EspG-PAK crystals had the symmetry of space
group P2₁2₁2₁ with unit-cell parameters of a = 86.7 Å, b = 104.6 Å and
c = 192.0 Å, and contained four molecules of EspG-PAK per asymmetric unit.
EspG-PAK crystals diffracted to a d_min of 2.85 Å when exposed to synchrotron
radiation. Data were indexed, integrated and scaled using the HKL-3000 program
package. Data collection statistics are provided in Supplementary Table 1.

Phase determination and structure refinement: Phases for the EspG-ARF6
complex were obtained from a three-wavelength anomalous dispersion experiment
using selenomethionyl-substituted protein with data to a d_min of 2.50 Å.
Fifteen selenium sites were located using the program SHELXD; this represented
nine single-occupancy selenium sites and six half-occupancy selenium sites per
EspG-ARF6 heterodimer. Phases were refined with the program MLPHARE,
resulting in an overall figure of merit of 0.41 for data between 32.9 and 2.50 Å.
Phases were further improved by density modification with the program DM,
resulting in a figure of merit of 0.70. An initial model containing 97% of all EspG
residues was automatically generated by alternating cycles of the programs
RESOLVE and BUCCANEER. Inspection of the electron density map revealed
density for the ARF6 molecule, but the automatic model-building programs were
unable to build a complete model for this protein. Placement of a model for ARF6
in the cell was performed by means of molecular replacement in the program
PHASER using the GTPγS-bound ARF6 (Protein Data Bank ID, 2J5X) as a
search model.

Additional residues for EspG were manually modelled in the program O.
Refinement was performed with the data collected at the selenium peak wavelength
to a resolution of 2.50 Å using the program PHENIX with a random 5% of
all data set aside for an R_free calculation. The current model contains one EspG and
one ARF6 monomer; included are residues 47-395 of EspG, residues 14-174 of
ARF6, one Mg²⁺-GTP and 138 water molecules. The R_work value is 22.5% and the
R_free value is 32.4%. The higher-than-average R_free value is probably due to the
relative dearth of lattice contacts for the ARF6 molecule, as evidenced by weak
electron density for the portions of ARF6 that are distal to the EspG-binding site.
The density for the portions of ARF6 (residues 20-63 and 152-170) that are
proximal to EspG and to the Mg²⁺-GTP is strong and well connected. A
Ramachandran plot generated with MOLPROBITY indicated that 99.0% of all
protein residues are in allowed regions.

Phases for EspG-PAK were obtained by means of molecular replacement in the
program PHASER using the coordinates of EspG from the EspG-ARF6 structure
as a search model. Model building and refinement were performed as described
above, with the following modification: owing to the lower resolution of the data,
restrained non-crystallographic symmetry was implemented during refinement.
The current model contains four EspG monomers and four PAK peptides.
Included are EspG residues 42-158, 163-318 and 321-397 and PAK residues
122-135, in complex A; EspG residues 43-397 and PAK residues 122-133, in
complex B; EspG residues 42-158, 163-316 and 322-395 and PAK residues
123-132, in complex C; and EspG residues 42-158, 163-317 and 320-395 and
PAK residues 122-134, in complex D. The R_work value is 20.3% and the R_free value
is 28.6%. A Ramachandran plot generated with MOLPROBITY indicated that
99.4% of all protein residues are in allowed regions. Phasing and model refinement
statistics are provided in Supplementary Table 1.

Combination models were generated by structural alignment of homologous or 
identical proteins from separate independent structures where applicable, using 
PYMOL. 

Preplay of future place cell sequences by
hippocampal cellular assemblies 


George Dragoi¹ & Susumu Tonegawa¹


During spatial exploration, hippocampal neurons show a sequential
firing pattern in which individual neurons fire specifically at particular
locations along the animal’s trajectory (place cells).
According to the dominant model of hippocampal cell assembly
activity, place cell firing order is established for the first time during
exploration, to encode the spatial experience, and is subsequently
replayed during rest or slow-wave sleep for consolidation of
the encoded experience. Here we report that temporal sequences
of firing of place cells expressed during a novel spatial experience
occurred on a significant number of occasions during the resting or
sleeping period preceding the experience. This phenomenon, which
is called preplay, occurred in disjunction with sequences of replay of
a familiar experience. These results suggest that internal neuronal
dynamics during resting or sleep organize hippocampal cellular
assemblies into temporal sequences that contribute to the
encoding of a related novel experience occurring in the future.

We recorded neuronal firing sequences from the CA1 area of the 
mouse hippocampus (Supplementary Fig. 1) during periods of awake 
rest (Fam-Rest) alternating with periods of running (Fam-Run) on a 
familiar track (Fam session; Supplementary Fig. 2a) that preceded the 
exploration of a novel linear arm in contiguity with the familiar track 
(Contig-Run on L-shaped track; Fig. 1, Supplementary Fig. 2a and 
Methods). All the place cells active on the novel arm during Contig-Run,
whether previously silent (19% in both directions and 31% in at
least one direction; Methods and Supplementary Tables 1-3) or active
during Fam-Run (subpanels a in Fig. 1), fired during Fam-Rest at the
ends of the familiar track (range, 0.17-11.7 Hz; Supplementary Fig. 3)
as part of a number of ‘spiking events’. The spiking events were defined
as epochs composed of multiple individual spikes from at least four
different place cells active on the novel arm or familiar track, separated
by less than 50 ms and flanked by at least 50 ms of silence. More
importantly, the temporal sequence in which the cells active on the
novel arm fired during Fam-Rest (subpanels b in Fig. 1) was significantly
correlated with the spatial sequence in which they fired later as
place cells on the novel arm during Contig-Run (subpanels c in Fig. 1), 
despite being uncorrelated with their spatial sequence as place cells on 
the familiar track during Fam-Run. This is illustrated as place cell 
sequences during Contig-Run (subpanels c in Fig. 1) and Fam-Run 
(subpanels a in Fig. 1) compared with the firing sequences of these cells 
within individual spiking events observed during Fam-Rest (subpanels 
b in Fig. 1). We refer to this process as ‘preplay’ of place cell sequences 
because the temporal sequence of firing during Fam-Rest had occurred 
before the actual exploration of the novel arm in the subsequent 
Contig-Run and was not a replay of the place cell sequences from 
the previous Fam-Run. 
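
The spiking-event definition given above (spikes less than 50 ms apart, at least 50 ms of flanking silence, at least four different cells) can be sketched as follows. This is a minimal illustration, not the authors' analysis code; the function name `detect_spiking_events` and the `(time, cell_id)` input format are assumptions.

```python
def detect_spiking_events(spikes, max_gap=0.050, min_cells=4):
    """Group spikes into candidate spiking events.

    spikes: iterable of (time_in_seconds, cell_id) pairs.
    Consecutive spikes closer than max_gap are merged into one epoch, so each
    epoch is automatically flanked by at least max_gap of silence. Epochs with
    fewer than min_cells distinct cells are discarded.
    """
    events, current = [], []
    for t, cell in sorted(spikes):
        # A gap of max_gap or more closes the current epoch.
        if current and t - current[-1][0] >= max_gap:
            if len({c for _, c in current}) >= min_cells:
                events.append(current)
            current = []
        current.append((t, cell))
    # Flush the last epoch.
    if current and len({c for _, c in current}) >= min_cells:
        events.append(current)
    return events
```

Each returned event is a time-sorted list of spikes, which is the form needed to read off the firing order of the participating cells.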

To quantify the significance of preplay and to compare it with replay, 
we created place cell sequence templates according to the spatial order 
of the peak firing of place cells on the novel arm during Contig-Run
(novel arm templates; subpanels c in Fig. 1 and Methods) and on the 
familiar track during Fam-Run (familiar track templates) for each run 
direction. The spikes of all the place cells used to construct the two types 
of template that were emitted during Fam-Rest were sorted by time, and 
spiking events were determined as explained above (subpanels b in 
Fig. 1). For each spiking event, we calculated a rank-order correlation 
between the novel arm templates and the temporal sequence of firing of 
the corresponding cells in the spiking events during Fam-Rest. The 
event correlation was considered significant if it exceeded the 97.5th 
percentile of a distribution of correlations resulting from randomly 
shuffling the order of place cells in the novel arm templates 200 times 
(P < 0.025). Forward and reverse preplay refer to the cases in which
the sequence of place cells during Contig-Run and the firing order of 
the corresponding cells in Fam-Rest were in the same and opposite 
directions, respectively. In 91% of the preplay cases, the spiking events 
were correlated with the novel arm template in one direction only. The 
distribution of event correlation values obtained using the original 
novel arm templates was significantly shifted towards higher positive 
or negative values in comparison with the distribution of correlation 
values obtained using shuffled templates (Fig. 2a and Supplementary 
Fig. 4). Figure 2a also shows the distribution of significant preplay 
events (in red). Of all the spiking events detected as above and in which 
at least four novel arm place cells were active, 14.2% were significant 
preplay events for the place cell sequence on the novel arm (P< 10 ~*, 
binomial probability test*) in the forward or reverse order (Fig. 2b). 
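
The template-shuffling test described above can be sketched as follows. The sketch assumes that each cell contributes one rank per event (for example, the order of its first spike), that forward and reverse matches are handled together by comparing absolute correlations, and that the percentile is taken by the nearest-rank rule. None of these details is stated in the text, and the function names are hypothetical.

```python
import math
import random


def rank_order_correlation(template, event_order):
    """Spearman correlation between template rank and firing rank (no ties).

    template: cell ids sorted by place-field peak position.
    event_order: cell ids sorted by firing time within one spiking event.
    Only cells present in both lists are used.
    """
    cells = [c for c in template if c in event_order]
    order = [c for c in event_order if c in template]
    n = len(cells)
    if n < 2:
        raise ValueError("need at least two shared cells")
    d2 = sum((rank - order.index(c)) ** 2 for rank, c in enumerate(cells))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def is_significant(template, event_order, n_shuffles=200, percentile=97.5, rng=None):
    """Compare |r| with the given percentile of |r| under template shuffles."""
    rng = rng or random.Random(0)
    observed = abs(rank_order_correlation(template, event_order))
    shuffled = list(template)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(abs(rank_order_correlation(shuffled, event_order)))
    null.sort()
    # Nearest-rank percentile of the shuffled distribution.
    index = math.ceil(len(null) * percentile / 100.0) - 1
    threshold = null[index]
    return observed > threshold, observed, threshold
```

With 200 shuffles and the 97.5th percentile, the threshold is the 195th smallest shuffled value, so an event is called significant only if it beats all but at most five shuffles.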

The occurrence of significant preplay events was correlated with the 
occurrence of high-frequency ripple oscillations in CA1 (Fig. 2c). The 
majority of the significant preplay events (81.1%; Fig. 2d, total, blue) 
took place at the junction between the familiar and novel arms, and the 
remaining 18.9% took place at the free end of the familiar track 
(Fig. 2d, total, purple). The proportion of significant preplay events 
among the total events at each of the two track ends was higher at the 
junctional end (15.2%, P< 10 *°) than at the free end (8.5%, 
P< 10 *) of the familiar track (P < 0.035, Z-test; Fig. 2d, normalized). 

We found a relatively high correlation between the place field maps 
(Fig. 1A, B and Supplementary Fig. 5) of the familiar track before and 
after the novel experience (median r= 0.66; Fig. 2e, familiar track, 
blue); it was significantly higher than the correlations obtained when 
the cell identities were shuffled (median r = 0.23, P< 10 *; Fig. 2e, 
familiar track, black). A similar correlation analysis showed a relatively 
high stability of the newly formed place fields on the novel arm from 
the beginning to the end of Contig-Run (median r= 0.62 (newly 
formed) versus median r= 0.21 (shuffled), P< 10-?; Fig. 2e, novel 
arm, blue versus grey). These results suggest that preplay of the novel 
arm does not occur over an entirely new (that is, remapped) repre- 
sentation of the whole L-shaped track but rather benefits from the 
relative stability of the familiar track representation across sessions 
and perhaps facilitates the rapid, stable encoding of the novel arm 
experience. 

Using the familiar track templates and spiking events during Fam- 
Rest, constructed as above, we determined that 16.2% (P< 10 1. data 
not shown) were significant replay events among the spiking events
in which a minimum of four familiar track place cells were active. All 
significant preplay events occurring during Fam-Rest (n = 75) were 

Figure 1 | Preplay of novel place cell sequences. Fam-Run and Fam-Rest
respectively denote run and rest sessions on the familiar linear track before
barrier removal; Contig-Run denotes run sessions on the L-shaped track after
barrier removal. The L-shaped track was linearized for display/analysis.

A, B, mouse 1; C, D, mouse 2; E, mouse 3. A-E, a, Spatial activity on the familiar 
track during Fam-Run of the cells that had place fields in Contig-Run and 
preplayed during Fam-Rest (one cell per row); activity on the novel arm and 
familiar track are on the same scale. Horizontal arrows indicate run directions. 
Vertical grey bars indicate barrier locations during Fam-Run and Fam-Rest. 
A-E, b, Examples of representative spiking events in the forward or reverse 


tested for possible replay of the familiar track spatial sequence: these 
spiking events were more correlated with the novel arm template 
(Fig. 2f, red) than the familiar track template (Fig. 2f, blue). Seventy- 
two percent (n = 54) of the significant events previously considered to 
be preplay had no significant correlation with the familiar track tem- 
plate. An additional 16% (n = 12) of those events were better correlated 
with the novel arm templates (mean absolute r = 0.92) than with the 
familiar track template (mean absolute r = 0.67, P< 10 °). Together, 
these findings reject the hypothesis that the preplay events simply rep- 
resent a replay of the familiar track activity (see additional controls in 
Supplementary Information). Moreover, we found that the proportion 
of events exclusively composed of silent cells that perfectly matched the 
novel arm spatial templates was 0.67 (16 of 24 triplets), which is sig- 
nificantly greater (P < 0.025) than the proportion of by-chance perfect 
matches (0.33). 
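
One way to read the chance level of 0.33 is that, for a triplet of cells, two of the six possible firing orders (the forward and the reverse order) match the template perfectly. The sketch below checks that arithmetic and compares 16 matches out of 24 against it with a one-sided binomial tail. The choice of test is an assumption, since the text does not name the test used for this comparison.

```python
from itertools import permutations
from math import comb


def chance_perfect_match(n_cells):
    """Fraction of firing orders matching the template forward or in reverse."""
    template = tuple(range(n_cells))
    orders = list(permutations(template))
    hits = sum(o == template or o == template[::-1] for o in orders)
    return hits / len(orders)


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


chance = chance_perfect_match(3)          # 2 of 6 orders
p_value = binomial_tail(16, 24, chance)   # 16 of 24 triplets matched
```

Under these assumptions the observed proportion (16/24, about 0.67) lies well into the upper tail of the chance distribution, consistent with the reported P < 0.025.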

To illustrate the distribution and relative proportions of preplay and 
replay events among all significant spiking events during Fam-Rest, we 
direction during Fam-Rest in 250-ms time windows (350 ms for the second and
fourth panels from left in E, b). Tick marks indicate individual spikes: red, 
preplay events for place cell sequences in the novel arm; blue (in A, b and 

B, b), additional spikes from the familiar track place cells participating in the 
spiking event (not shown in C-E, b). Numbers on the left denote cell numbers 
and correspond to the place cell numbers in A-E, a. Square boxes indicate the 
ends of the familiar track where preplay events occurred. Local field potentials 
recorded simultaneously with the spikes are shown above spiking events. 
A-E, c, Place cell sequences in the novel arm (C-E, c; red) or in both the novel 
arm (red) and the familiar arm (blue) (A, c and B, c) in Contig-Run. 


calculated a ‘template specificity index’ (Fig. 2g and Methods) for each 
event. Pure preplay events (Fig. 2g, red) and pure replay events (Fig. 2g, 
blue) were segregated, and only a minority of events were significant 
for both preplay and replay (Fig. 2g, yellow). Consistent with this 
segregation of preplay and replay events, the novel arm and the ‘cor- 
responding familiar track’ templates were not significantly correlated 
(Fig. 2h and Methods). The ratio between the number of pure replay 
events (n = 171) and the number of pure preplay events (n = 54) 
during Fam-Rest was about 3.1 (Fig. 2g, inset; see Supplementary 
Information for proportions of events). Preplay and replay events were 
distributed in time across Fam-Rest (Supplementary Fig. 6a-c) and
their occurrences were generally uncorrelated (Supplementary Fig. 6d). 
The majority (79.9%) of the spiking events during Fam-Rest did not 
significantly correlate with either of the two templates (data not shown). 

We used a Bayesian reconstruction algorithm (Methods) to
decode the animals’ position from the spiking activity during Fam-
Run (Fig. 3a) or Fam-Rest (Fig. 3b, c). For all original and shuffled

Figure 2 | Quantification of the preplay phenomenon and comparison with
replay. a, Distribution of correlations between spiking events in Fam-Rest and 
spatial templates of the novel arm. Open bars indicate spiking events versus the 
original (unshuffled) templates; filled bars indicate spiking events versus 200 
shuffled templates scaled down 200 times; red bars show the distribution of 
preplay (that is, significant) events. Similar distributions (not shown) of 
corresponding spiking events were obtained when spatial templates were 
constructed using all place cells active on the L-shaped track (Figs 1A, b, c and 
1B, b, c; red and blue). b, Proportion of all, forward and reverse preplay events 
among the spiking events in Fam-Rest. The dotted line indicates the chance 
level (3.2%). c, Cross-correlation between preplay events and ripple epochs.
d, Location of preplay events on the familiar track: total, proportions of preplay 
events at ends of the track; normalized, proportion of preplay events 
normalized by the number of spiking events at each end of track. Preplay events 
represented a trajectory running from the free end of the novel arm to the 
junctional end (40%) or begun near the familiar track (60%); the latter suggests 
that in some cases preplay events could be triggered by the activity of the 
familiar track place cells during Fam-Rest. e, Stability of place cell spatial tuning 
across the novel experience: familiar track, stability of the place fields active on 
the familiar track before (Fam-Run) versus after (Contig-Run) barrier removal; 
novel arm, stability of the place fields active on the novel arm at the beginning 


probability distributions, a line was fitted to the data using a line-finding
algorithm to represent the decoded virtual trajectory (Methods
and Supplementary Information). In 16.36% of cases representing 
trajectories, the reconstructed trajectory during spiking events in Fam- 
Rest was contained within the novel arm (Fig. 3c, top), a place the 
animal had not yet visited (that is, trajectory preplay). Moreover, in 
79.8% of the trajectory preplay cases the shuffling procedures resulted 
in lines that were significantly less or not at all contained within the 
novel arm (that is, not preplay; Supplementary Information). The 
remaining trajectories decoded during Fam-Rest represented replay 
of the familiar track (64.15%; Fig. 3c, middle) or spanned the joint 
familiar track/novel arm space (19.49%; Fig. 3c, bottom). Means of 
absolute rank-order correlations between spiking activity and novel 
arm templates (Fig. 2a) restricted during epochs of trajectory preplay 
were significantly larger than those between spiking activity and familiar 
track templates calculated during the same epochs (0.75 versus 0.59, 
P<10 “). Overall, these results support the existence of the preplay 
phenomenon. 
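
The position-decoding step can be illustrated with the widely used memoryless Poisson decoder with a uniform prior over position. This is an assumed model for illustration; the exact decoder and its parameters are specified in the paper's Methods, and the function name `decode_position` and the rate floor are hypothetical.

```python
import math


def decode_position(spike_counts, rate_maps, bin_seconds):
    """Posterior over position bins for one time bin (uniform prior).

    spike_counts: number of spikes per cell in the time bin.
    rate_maps: rate_maps[i][x] is the firing rate (Hz) of cell i at position x.
    bin_seconds: duration of the time bin in seconds.
    """
    n_pos = len(rate_maps[0])
    log_post = []
    for x in range(n_pos):
        lp = 0.0
        for count, rates in zip(spike_counts, rate_maps):
            rate = max(rates[x], 1e-3)  # floor avoids log(0); an assumption
            # Poisson log-likelihood, dropping terms that do not depend on x.
            lp += count * math.log(rate) - bin_seconds * rate
        log_post.append(lp)
    # Normalize in a numerically stable way.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]
```

Applying the function to successive time bins (250 ms during running, 20 ms during rest in Fig. 3) yields the probability maps to which a trajectory line can then be fitted.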

To investigate the possibility that preplay of novel arm place cell 
sequences during Fam-Rest depends on the prior run experience on 
the familiar track, mice with no prior experience on any linear track 
were placed in a high-walled sleep box and recorded while resting/ 
sleeping. The animals were then transferred to a novel isolated linear 
track that was in the same room but could not be seen from inside the 


(first four laps of run) versus the end (last four laps) of the Contig-Run session. 
Data (blue), within-cell correlation of place cell spatial tuning for the 
corresponding track/arm; shuffle (black), cell identity shuffle (Supplementary 
Information). Error bars, s.e.m.; asterisks in d and e indicate significant 
differences. f, Distribution of preplay event correlations (red) versus 
distribution of these event correlations with the familiar track template (blue). 
Spiking events were detected using all place cells from the familiar track and 
novel arm templates (>1 Hz). Red bars are the same as in a. Correlation is 
strong with the novel arm template (preplay) and weak with the familiar arm 
template (replay). The P value corresponds to there being a significant 
difference between the two distributions. g, Disjunctive distribution of pure 
preplay (red), pure replay (blue) and preplay/replay (yellow) events during 
Fam-Rest over their template specificity index (Supplementary Information). 
Inset, proportions of pure preplay events (red), pure replay events (blue) and 
preplay/replay events (yellow) among all of the spiking events that were 
significantly correlated with at least familiar track templates or novel arm 
templates. h, Lack of correlation between the novel arm template and the 
corresponding familiar track template. Each of the six dots represents either a 
forward or a reverse run direction of one of the three mice analysed. Red 
horizontal line denotes a P value of 0.05. The correlation values were not 
significant in any of the cases (Supplementary Information). 


box, and the recording continued during de novo formation of place 
cells (Supplementary Fig. 2b, de novo session). We found that in a 
relatively large proportion (16.1%) of spiking events identified during 
sleep/rest in the sleep box, the neuronal firing sequences were signifi- 
cantly correlated with the place cell sequences observed during the first 
run session on the novel track (Fig. 4A, B and Methods); this was the 
case for all four individual mice (Supplementary Fig. 7). Preplay events 
were associated with the ripple occurrence (Fig. 4C). The place cells 
established on the novel track in the de novo session were more 
dynamic (median r = 0.42; Fig. 4D, blue) than in Contig-Run (median 
r= 0.62, P< 0.016; Fig. 2e, right, blue). 

We have demonstrated that a significant number of temporal firing 
sequences of CA1 cells during resting periods of a familiar track
exploration that preceded a novel track exploration in the same general 
environment were correlated with the place cell sequences of the novel 
track rather than the familiar track. This phenomenon, preplay, is
temporally opposite to the process of replay, when activity during
rest or sleep periods recapitulates place cell sequences that have
already occurred during previous explorations. Preplay differs fundamentally
from replay because it occurs before exploration of novel
tracks.

Although our recordings were carried out in CA1, we believe that 
what we observed could be a reflection of the output of the recurrent 
cellular assemblies from upstream regions (CA3 or entorhinal cortex). 

Figure 3 | Bayesian reconstruction of the animal’s trajectory in the familiar
track (replay) and novel arm (preplay). a, Position reconstruction of a one-lap
run on the familiar track from the ensemble place cell activity during Fam-Run.
The heat map displays the reconstructed position of the animal using ensemble
place cell activity during the run (250-ms bins; animal velocity, >5 cm s⁻¹).
The yellow line indicates the actual trajectory of the animal during Fam-Run.
b, Example of virtual trajectory reconstruction (familiar track and novel arm)
from the ensemble place cell activity during Fam-Rest at the ends of the familiar
track (20-ms bins; animal velocity, <5 cm s⁻¹) before barrier removal and
novel arm exploration. The yellow line reflects the spatial location of the animal
in time: the animal was immobile at the junction end of the familiar track. The
time-compressed (~5 m s⁻¹) trajectory reconstruction often ‘jumps’ over the
barrier (top of the figure) into the novel arm area. At around 0.5 s, a preplay of
the novel arm initiated from the distal (free) end of the novel arm ‘propagates’ 
towards the location of the animal. c, Examples of preplay of the novel arm 
(top), replay of the familiar track (middle) and preplay of the novel arm 
together with replay of the familiar track (bottom) during Fam-Rest. All 
conditions are the same as in b. The white line shows the linear fit maximizing 
the likelihood along the virtual trajectory. Colour bars indicate probability of 
trajectory reconstruction. 


During running on a familiar track, some of the cells in the postulated 
upstream cellular assemblies fire sequentially at spatial locations while 
others, although connected anatomically to these cells, remain silent. 
The lack of expression of preplay sequences during Fam-Run may 
reflect their state-dependent suppression or subthreshold activation 
during these exploratory behaviours. Owing to increased net excitation 
during rest periods, predominantly during ripples, some of these
silent cells together with some of the familiar track cells are activated 
above threshold and fire in a certain sequence. Their sequence of 
activation may be determined in part by their functional connectivity 
within the hippocampal formation network. Some of these sequences 
may in turn be activated on a novel track as place cell sequences 
(Supplementary Fig. 8). The activation of the novel place cell sequences 
during running may strengthen their pre-existing assembly organiza- 
tion manifested during preplay. 

It could be argued that during Contig-Run the animals simply con- 
sidered the novel arm to be an extension of the familiar arm and, thus, 
what we considered to be preplay events were replays of the previous 
runs on the familiar track. If this was the case, preplay events would not 

Figure 4 | Preplay of novel place cell sequences before any linear track
experience. A, Sleep/rest session in the sleep box (Pre-Run sleep/rest) before 
the first run session on a linear track (De novo-Run). Display format is the same 
as in Fig. 1. A, a, Representative spiking events in the forward or reverse order 
during Pre-Run sleep/rest in 400-ms time windows. A, b, Place cell sequences 
on the novel track (red) during the De novo-Run session. Each row represents 
one cell in which the activity was normalized to the maximum firing rate. One 
run direction in one animal is shown. The median number of place cells active 
on the novel track participating in preplay events is six. B, Distribution of 
spiking events in Pre-Run sleep/rest as a function of the rank-order correlation 
with the place cell sequence template of the novel track. Display format is the 
same as in Fig. 2a. C, Cross-correlation between preplay events and ripple 
epochs during Pre-Run sleep/rest. D, Stability of place cell spatial tuning across 
the novel track experience. Display format is the same as in Fig. 2e (novel arm). 
Error bars, s.e.m.; asterisk indicates significant difference. 


be expected to be found when the experience of the familiar track run is 
eliminated. This idea was refuted by the demonstration of frequent 
preplay events in the sleep box before the mice were transferred onto a 
novel linear track (de novo condition). Under this condition, the place 
cell sequences were more dynamic and a higher proportion of all 
spiking events correlated with the place cell sequences in these runs 
than in the later runs on novel linear tracks. These results suggest a 
shift in the relative contribution of internal and external drives in
the formation of place cell sequences on encounter with a novel track. 
In the early phase, internal drives originating in the dynamic cellular 
assembly activities, which probably reflect numerous past experiences 
distinct from the current one and expressed as preplay, may have a 
greater role, whereas in the late phase, external drives that come from 
the specific set of stimuli of the current experience may dominate. 
Thus, place cell sequences on novel tracks seem to be products of a 
dynamic interplay between the internal and external drives. 

Several previous studies did not reveal preplay’*’°”. Although it is 
difficult to pinpoint the apparent discrepancies between these studies 
and the present one, we suggest that the use of insufficiently sensitive 
methods (pairwise correlations) by some studies’**° and small sample 
sizes by others’® might have precluded detection of preplay in previous 
work (see Supplementary Information for details). Data from the de 
novo condition (Fig. 4), in which we observed an even higher propor- 
tion of preplay events, have not been reported previously. 

Our data showed that novel preplay events coexist in disjunction 
with familiar replay events during the rest periods on the familiar 
track. This and the finding that these preplay and replay events 
together make up fewer than one-quarter of all detected spiking events 
suggest that they are part of a dynamic repertoire of temporal 
sequences in the hippocampus that are past-experience dependent 
(replay) or future-experience expectant™ (preplay). Post-experience 
replay of place cell sequences during resting*° or slow-wave sleep* '° 
has been proposed to have an important role in memory consolida- 
tion"’’*. The temporal preplay of new place cell sequences during 
resting or sleep is consistent with a predictive function for the hippo- 
campal formation® and may contribute to accelerating learning”® 
when a new experience is introduced in multiple steps of increasing 
novelty. 


METHODS SUMMARY 


We recorded place cells from the CA1 area of the hippocampus with six indepen- 
dently movable tetrodes in four mice during sleep/rest sessions in the sleep box 
before any experience on linear tracks and during the first run session on a novel 
track. Following familiarization with the linear track, animals were subsequently 
allowed to explore a continuous (L-shaped) track in which the now familiar track 
and a new novel arm were made contiguous. To quantify the significance of the 
preplay and replay processes, spiking events in which at least four cells were active 
were detected during sleep/rest (speed, <1 cm s⁻¹) periods in the sleep box or 
awake rest (speed, <2 cm s⁻¹) periods at the ends of the familiar track and novel 
arm, predominantly during ripple epochs. 

We calculated statistical significance at the P<0.025 level for each event by 
comparing the rank-order correlation between the event sequence and the place 
cell sequence (template) with the distribution of correlation values from 200 tem- 
plates obtained by shuffling the original order of the place cells. Proportions of 
significant events were calculated as the ratio between the number of significant 
events and the total number of spiking events. We calculated the overall significance 
of preplay or replay processes by comparing the distribution of correlation values of 
all events with the distribution of correlation values of shuffled templates 
(Kolmogorov-Smirnov test). The significance of the proportion of significant events 
out of the total number of spiking events was determined as the binomial probability 
of observing the number of significant events (as successes) from the total number of 
spiking events (as independent trials), with a probability of success of 0.025 in any 
given trial. We reconstructed the position of the animal from the spiking activity 
emitted during resting periods using Bayesian decoding procedures’. 
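
The binomial test on the proportion of significant events can be illustrated with a minimal sketch. It assumes the quantity of interest is the upper-tail probability of seeing at least the observed number of significant events; the summary does not say whether the exact or the tail probability was used, and the function name `binomial_tail` and the counts in the usage line are hypothetical.

```python
from math import comb


def binomial_tail(n_events: int, n_significant: int, p: float = 0.025) -> float:
    """Return P(X >= n_significant) for X ~ Binomial(n_events, p).

    n_events      -- total number of spiking events (independent trials)
    n_significant -- number of events that passed the per-event test
    p             -- chance probability of a significant event (0.025 in the text)
    """
    return sum(
        comb(n_events, k) * p**k * (1.0 - p) ** (n_events - k)
        for k in range(n_significant, n_events + 1)
    )


# Hypothetical counts, for illustration only: 40 significant events out of 400.
print(binomial_tail(400, 40))
```

With a chance rate of 0.025, about 10 of 400 events would be expected to pass by chance, so a count far above that yields a very small tail probability.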


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 


Received 4 December 2009; accepted 29 October 2010. 
Published online 22 December 2010. 


1. O'Keefe, J. & Nadel, L. The Hippocampus as a Cognitive Map (Oxford Univ. Press, 
1978). 

2. Wilson, M. A. & McNaughton, B. L. Dynamics of the hippocampal ensemble code 
for space. Science 261, 1055-1058 (1993). 

3. Foster, D. J. & Wilson, M. A. Reverse replay of behavioural sequences in 
hippocampal place cells during the awake state. Nature 440, 680-683 (2006). 

4. Diba, K. & Buzsaki, G. Forward and reverse hippocampal place-cell sequences 
during ripples. Nature Neurosci. 10, 1241-1242 (2007). 

5. Karlsson, M. P. & Frank, L. M. Awake replay of remote experiences in the 
hippocampus. Nature Neurosci. 12, 913-918 (2009). 

6. Davidson, T. J., Kloosterman, F. & Wilson, M. A. Hippocampal replay of extended 
experience. Neuron 63, 497-507 (2009). 

7. Wilson, M. A. & McNaughton, B. L. Reactivation of hippocampal ensemble 
memories during sleep. Science 265, 676-679 (1994). 

8. Skaggs, W. E. & McNaughton, B. L. Replay of neuronal firing sequences in rat 
hippocampus during sleep following spatial experience. Science 271, 1870-1873 
(1996). 




9. Nadasdy, Z., Hirase, H., Czurko, A., Csicsvari, J. & Buzsaki, G. Replay and time 
compression of recurring spike sequences in the hippocampus. J. Neurosci. 19, 
9497-9507 (1999). 

10. Lee, A. K. & Wilson, M. A. Memory of sequential experience in the hippocampus 
during slow wave sleep. Neuron 36, 1183-1194 (2002). 

11. Buzsaki, G. Two-stage model of memory trace formation: a role for “noisy” brain 
states. Neuroscience 31, 551-570 (1989). 

12. Nakashiba, T., Buhl, D. L., McHugh, T. J. & Tonegawa, S. Hippocampal CA3 output is 
crucial for ripple-associated reactivation and consolidation of memory. Neuron 62, 
781-787 (2009). 

13. Hebb, D. O. The Organization of Behavior: A Neuropsychological Theory (Wiley, 
1949). 

14. Harris, K. D., Csicsvari, J., Hirase, H., Dragoi, G. & Buzsaki, G. Organization of cell 
assemblies in the hippocampus. Nature 424, 552-556 (2003). 

15. Dragoi, G. & Buzsaki, G. Temporal encoding of place sequences by hippocampal 
cell assemblies. Neuron 50, 145-157 (2006). 

16. Thompson, L. T. & Best, P. J. Place cells and silent cells in the hippocampus of 
freely-behaving rats. J. Neurosci. 9, 2382-2390 (1989). 

17. O'Neill, J., Senior, T. & Csicsvari, J. Place-selective firing of CA1 pyramidal cells 
during sharp wave/ripple network patterns in exploratory behavior. Neuron 49, 
143-155 (2006). 

18. Zhang, K., Ginzburg, I., McNaughton, B. L. & Sejnowski, T. J. Interpreting neuronal 
population activity by reconstruction: unified framework with application to 
hippocampal place cells. J. Neurophysiol. 79, 1017-1044 (1998). 

19. Johnson, A. & Redish, A. D. Neural ensembles in CA3 transiently encode paths 
forward of the animal at a decision point. J. Neurosci. 27, 12176-12189 (2007). 

20. Kudrimoti, H. S., Barnes, C. A. & McNaughton, B. L. Reactivation of hippocampal 
cell assemblies: effects of behavioral state, experience, and EEG dynamics. J. 
Neurosci. 19, 4090-4101 (1999). 

21. Csicsvari, J., Hirase, H., Czurko, A., Mamiya, A. & Buzsaki, G. Oscillatory coupling of 
hippocampal pyramidal cells and interneurons in the behaving rat. J. Neurosci. 19, 
274-287 (1999). 

22. Dragoi, G., Harris, K. D. & Buzsaki, G. Place representation within hippocampal 
networks is modified by long-term potentiation. Neuron 39, 843-853 (2003). 

23. Pastalkova, E., Itskov, V., Amarasingham, A. & Buzsaki, G. Internally generated cell 
assembly sequences in the rat hippocampus. Science 321, 1322-1327 (2008). 

24. Black, J. E. & Greenough, W. T. Advances in Developmental Psychology (Lawrence 
Erlbaum, 1986). 

25. Hassabis, D., Kumaran, D., Vann, S. D. & Maguire, E. A. Patients with hippocampal 
amnesia cannot imagine new experiences. Proc. Natl Acad. Sci. USA 104, 
1726-1731 (2007). 

26. Tse, D. et al. Schemas and memory consolidation. Science 316, 76-82 (2007). 


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank M. A. Wilson for assistance with data acquisition, 
discussions and comments on an earlier version of the manuscript; J. O’Keefe, 
A. Siapas, F. Kloosterman and D. L. Buhl for comments on earlier versions of the 
manuscript; and F. Kloosterman for providing assistance with the line detection for the 
Bayesian decoding. This work was supported by NIH grants R01-MH078821 and 
P50-MH58880 to S.T., who was an HHMI Investigator in an earlier part of this study. 


Author Contributions S.T. and G.D. conceived the project jointly. G.D. designed and 
performed the experiments and the analyses. G.D. and S.T. wrote the paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to G.D. (gdragoi@mit.edu) or S.T. (tonegawa@mit.edu). 




©2010 Macmillan Publishers Limited. All rights reserved 




METHODS 

Surgery and experimental design. Electrophysiological recordings were per- 
formed on four C57BL/6 mice (strain NR1-floxed’’) with ages between 18 and 
22 weeks. All animals were implanted under Avertin anaesthesia with six inde- 
pendently movable tetrodes aiming for the CA1 area of the right hippocampus 
(1.5-2 mm posterior to bregma and 1-2 mm lateral to the midline; Supplementary 
Fig. 1). The reference electrode was implanted posterior to lambda over the cere- 
bellum. During the following week of recovery, the electrodes were advanced daily 
while animals rested in a small, walled sleeping box (12 × 20 cm², 35 cm high). The 
animal position was monitored by means of two infrared diodes attached to the 
headstage. 

The experimental apparatus consisted of a 90 × 65 cm² rectangular, walled, 
linear track maze. All tracks were 4 cm wide at the bottom and 8-9 cm wide at 
the top, and all linear track walls were 10 cm high. Experimental sessions were 
conducted while the animals explored for chocolate sprinkle rewards placed 
always at the ends of the corresponding linear tracks (one sprinkle at each end 
of the track on each lap). Neuronal activity was recorded in naive animals (four 
mice) during the sleep/rest session in the sleep box immediately preceding the first 
experience on linear tracks, and continued (Fig. 4) during the first run session on a 
novel track. After familiarization with the linear track, the animals went through a 
recording session of 15-60 min (Fam session), and the recordings continued for 
the next 34-42 min (Contig session) while the animals explored an L-shaped track 
for the first time. In this track, the familiar arm and the novel arm were made 
contiguous by removing the barrier that had separated them (Fig. 1). For the 
purpose of analysing the recording data, the Fam session was further divided into 
Fam-Run, in which the animals ran through the track (velocity of animal’s movement 
was higher than 5 cm s⁻¹) and Fam-Rest, in which the animals took awake rests 
at the ends of the track (velocity of animal’s movement was less than 2 cm s⁻¹). 
During resting periods, the animals consumed the chocolate sprinkle and 
groomed, but mostly they were still until they self-initiated the next lap of run 
on the linear track. After completion of the experiments, the brains of all mice were 
perfused, fixed, sectioned and stained using nuclear fast red (Supplementary Fig. 1) 
or cresyl violet for electrode track reconstruction. 

Recordings and single-unit analysis. A total of 87 neurons were recorded from 
the CA1 area of the hippocampus in four mice during the Fam and Contig sessions 
(Supplementary Tables 1-3). A total of 69 CA1 neurons were recorded from the 
four mice in the de novo condition (26, 20, 10 and 13 cells, respectively). Single cells 
were identified and isolated using the manual clustering method Xclust* and the 
application of cluster quality measurements”*. Pyramidal cells were distinguished 
from interneurons on the basis of spike width, average rate and autocorrelations™. 

Place fields were computed as the ratio between the number of spikes and the 
time spent in 2-cm bins along the track, smoothed with a Gaussian kernel with a 
standard deviation of 2 cm. Bins where the animal spent a total of less than 0.1 s 
and periods during which the animal’s velocity was below 5 cm s⁻¹ were excluded. 
Place field length and peak rate were calculated after separating the direction of 
movement and linearizing the trajectory of the animal. Linearized place fields were 
defined as areas with a localized increase in firing rate above 1 Hz for at least five 
contiguous bins (10 cm). The place field peak rate and location were given by the 
rate and location of the bin with the highest ratio between spike counts and time 
spent. Place field borders were defined as the points where the firing rate became 
less than 10% of the peak firing rate or 1 Hz (whichever was greater) for at least 
2 cm. 
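
The rate-map computation described above can be sketched as follows. This is an illustrative reading rather than the analysis code used in the study: it assumes that velocity filtering has already been applied to the inputs, that smoothing is applied to the spikes-per-second ratio, and that excluded bins are simply skipped by the smoother. The function name `rate_map` and its parameter names are hypothetical.

```python
import math


def rate_map(spike_pos_cm, visit_pos_cm, dt_s, track_len_cm,
             bin_cm=2.0, sd_cm=2.0, min_time_s=0.1):
    """Firing rate (Hz) per spatial bin; None marks bins with too little occupancy.

    spike_pos_cm -- linearized position of the animal at each spike
    visit_pos_cm -- linearized position at each tracking sample
    dt_s         -- duration of one tracking sample in seconds
    """
    n_bins = math.ceil(track_len_cm / bin_cm)
    spikes = [0.0] * n_bins
    occupancy = [0.0] * n_bins

    def bin_of(x):
        return min(int(x // bin_cm), n_bins - 1)

    for x in spike_pos_cm:
        spikes[bin_of(x)] += 1.0
    for x in visit_pos_cm:
        occupancy[bin_of(x)] += dt_s

    # Ratio of spike count to time spent; bins visited for < min_time_s are excluded.
    raw = [s / t if t >= min_time_s else None for s, t in zip(spikes, occupancy)]

    # Gaussian smoothing over the valid bins only (an assumption of this sketch).
    smoothed = []
    for i, r in enumerate(raw):
        if r is None:
            smoothed.append(None)
            continue
        num = den = 0.0
        for j, rj in enumerate(raw):
            if rj is None:
                continue
            w = math.exp(-0.5 * ((i - j) * bin_cm / sd_cm) ** 2)
            num += w * rj
            den += w
        smoothed.append(num / den)
    return smoothed
```

A place field in the sense of the text would then be a run of at least five contiguous bins in the returned list with rates above 1 Hz.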

Local field potential analysis. Ripple oscillations were detected during sleep/rest 
periods in the sleep box and during rest periods at the ends of the tracks. The 
electroencephalography signal was filtered (120-200 Hz) and ripple-band ampli- 
tude was computed using the Hilbert transform. Ripple epochs with maximal 
amplitude more than 5 s.d. above the mean, beginning and ending at 1 s.d. were 
detected. The time of ripple occurrence (Figs 2c and 4C) was the time of its 
maximal amplitude. The proportion of ripples with which cells with place fields 
on the novel arm of the L-shaped track fired in the preceding session 
(Supplementary Fig. 3) was calculated for each qualifying cell as the ratio between 
the number of ripples during which the cell fired at least one spike and the total 
number of ripples during the corresponding exploratory session. 
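
The thresholding step of the ripple detection can be sketched as below. The sketch assumes the ripple-band amplitude envelope has already been obtained (band-pass filtering and the Hilbert transform are not reproduced here) and that the mean and standard deviation are taken over the whole supplied envelope. The function name `detect_ripples` is hypothetical.

```python
from statistics import mean, pstdev


def detect_ripples(envelope, peak_sd=5.0, edge_sd=1.0):
    """Return (start, peak, end) sample indices of candidate ripple epochs.

    An epoch is a contiguous run of samples above mean + edge_sd * s.d.
    whose maximum exceeds mean + peak_sd * s.d.
    """
    mu, sd = mean(envelope), pstdev(envelope)
    hi, lo = mu + peak_sd * sd, mu + edge_sd * sd
    epochs = []
    i, n = 0, len(envelope)
    while i < n:
        if envelope[i] > lo:
            start = i
            while i < n and envelope[i] > lo:
                i += 1
            end = i - 1
            # The time of the ripple is taken as the sample of maximal amplitude.
            peak = max(range(start, end + 1), key=envelope.__getitem__)
            if envelope[peak] > hi:
                epochs.append((start, peak, end))
        else:
            i += 1
    return epochs
```

Multiplying the returned indices by the sampling interval gives epoch boundaries and ripple times in seconds.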

Preplay and replay analyses. To analyse the preplay and replay processes, spiking 
events were detected during Pre-Run sleep/rest periods in the sleep box (de novo 
condition; velocity, <1 cm s⁻¹) or during awake rest periods at the ends of the 
running tracks (Contig condition; velocity, <2 cm s⁻¹). A spiking event was 
defined as a transient increase in the firing activity of a population of at least four 
different place cells within a temporal window preceded and followed by at least 
50 ms of silence. Overall, similar results were obtained using 50-, 60-, 75- and 100-ms 
time windows. The spikes of all the place cells active on the novel track that were 
emitted during the Pre-Run sleep/rest in the box for the de novo condition as well as 
the spikes of all the place cells active on the familiar track or the novel arm that were 
emitted during Fam-Rest session at the two ends of the familiar track for Contig 
condition were respectively sorted by time and further used for the detection of the 
spiking events. 
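
A minimal sketch of the spiking-event detection follows. It assumes that the pooled place-cell spikes are supplied as (time, cell) pairs and that the 50-ms window refers to the silent gap separating consecutive events; the function name `detect_spiking_events` and the cell labels used to exercise it are hypothetical.

```python
def detect_spiking_events(spikes, min_silence_s=0.050, min_cells=4):
    """Group pooled spikes into events separated by silence.

    spikes -- iterable of (time_s, cell_id) pairs from all place cells
    Returns a list of events, each a time-sorted list of (time_s, cell_id),
    keeping only events in which at least min_cells different cells fired.
    """
    events, current = [], []
    for t, cell in sorted(spikes):
        # A gap of at least min_silence_s closes the current candidate event.
        if current and t - current[-1][0] >= min_silence_s:
            if len({c for _, c in current}) >= min_cells:
                events.append(current)
            current = []
        current.append((t, cell))
    if current and len({c for _, c in current}) >= min_cells:
        events.append(current)
    return events
```

Changing `min_silence_s` to 0.060, 0.075 or 0.100 corresponds to the alternative time windows mentioned above.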

All four animals exhibited a significant number of spiking events in the Pre-Run 
session of the de novo condition. Three of the four animals (mice 1-3) exhibited a 
significant number of spiking events in the Contig condition, the remaining animal 
(mouse 4) having a below-threshold number of simultaneously active CA1 place 
cells. The time of the spiking event used to compute the cross-correlation with 
ripple epoch occurrence (Figs 2c and 4C) was the average time of all spikes com- 
prising the spiking event. The place cell sequences (templates) were calculated for 
each direction of the animal’s movement and for each run session (De novo-Run, 
Fam-Run and Contig-Run) by ordering the spatial location of the place field peaks 
that were above 1 Hz. For place cells with multiple place fields above 1 Hz on a 
particular arm or track in the Contig condition (six of 52 place cells active on the 
novel arm in the two directions, or 12%: two for each direction in mouse 1, one in 
mouse 2 and one in mouse 3), only the place field corresponding to the peak firing 
rate of the place cell on that arm or track was considered for the construction of the 
template of that particular arm or track, to be consistent with all the previous 
studies that used spatial templates to demonstrate replay during sleep or awake 
rest**°, Place cells with fields on both the novel arm in the Contig-Run session and 
the familiar track in the Fam-Run session participated in the construction of both 
the novel arm and familiar track templates. 

Statistical significance was calculated for each event by comparing the rank- 
order correlation between the sequence of cells’ firing during the event (that is, 
event sequence) and the place cell sequence (template), on the one hand, and the 
distribution of correlation values between the event sequence and 200 surrogate 
templates obtained by shuffling the order of place cells, on the other* (Fig. 2a). The 
significance level was set at 0.025 to control for multiple comparisons (two direc- 
tions of run). The proportions of significant events (preplay novel track, preplay 
novel arm (Fig. 2b), replay novel arm and replay familiar track) were each calcu- 
lated as the ratio between the number of significant events and the total number of 
spiking events in which at least four corresponding place cells were active’. 
Corresponding familiar track templates (Fig. 2h) were constructed by ordering 
the location of peak firing on the familiar track during Fam-Run (no minimum 
threshold of firing) of all place cells that subsequently fired on the novel arm. Cells 
comprising the corresponding familiar track templates are the same as those 
comprising the novel arm templates. We note that these corresponding familiar 
track templates are different from the ones used in Figs 1 and 2a-g, which were 
constructed by ordering the peak firing of all place cells active on the familiar track 
>1 Hz. 
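
The per-event shuffle test can be sketched as follows. Several details are assumptions of the sketch rather than statements from the text: each cell contributes one rank to the event sequence (for example by its first spike), every cell in the event belongs to the template, and the comparison with the surrogate distribution uses the absolute correlation so that forward and reverse events are treated alike. The names `spearman_rho` and `event_significance` are hypothetical.

```python
import random


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two tie-free rank vectors."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def event_significance(event_order, template, n_shuffles=200, alpha=0.025, seed=0):
    """Compare an event sequence with a place cell template.

    event_order -- cell ids in the order in which they fired during the event
    template    -- cell ids ordered by place field peak location on the track
    Returns (rho, p, significant).
    """
    rng = random.Random(seed)
    active = set(event_order)
    cells = [c for c in template if c in active]  # template restricted to active cells
    event_ranks = list(range(len(event_order)))

    def rho_against(order):
        rank = {c: i for i, c in enumerate(order)}
        return spearman_rho([rank[c] for c in event_order], event_ranks)

    observed = rho_against(cells)
    null = []
    for _ in range(n_shuffles):
        surrogate = cells[:]
        rng.shuffle(surrogate)  # surrogate template: shuffled place cell order
        null.append(rho_against(surrogate))
    p = sum(abs(r) >= abs(observed) for r in null) / n_shuffles
    return observed, p, p < alpha
```

The proportion of significant events is then the number of events flagged as significant divided by the number of events with at least four active template cells.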

The overall significance of the preplay (Fig. 2a) or replay process was calculated 
by comparing the distribution of correlation values of all events relative to the 
original template with the distribution of correlation values relative to the shuffled 
surrogate templates, using the Kolmogorov-Smirnov test*. Quantification of the 
replay versus preplay events during the Fam-Run session (Fig. 2f, g) was per- 
formed as described above using different spatial templates for the familiar track 
and the novel arm. All spiking events were correlated with both the novel arm and 
the familiar track templates. Events significantly correlated only with familiar 
track or with novel arm templates were considered pure replay and pure preplay, 
respectively. The template specificity index was calculated for each event as the 
difference between the absolute value of the event’s correlation with the novel arm 
template (preplay, high positive index) and the event’s correlation with the famil- 
iar track template (replay, high negative index). For the purpose of displaying the 
template specificity index, events correlated with the novel arm but not with the 
familiar track templates were considered preplay and events correlated with the 
familiar track but not with the novel arm templates were considered replay 
(Fig. 2g). Additionally, events correlated with both the familiar track and the novel 
arm templates formed a third group, preplay/replay events, displayed in yellow in 
the inset of Fig. 2g. 
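
The template specificity index and the three event groups can be written out as a short sketch. It assumes that the absolute value is taken of both correlations before subtraction, which is one reading of the definition above; the function names are hypothetical.

```python
def template_specificity_index(r_novel: float, r_familiar: float) -> float:
    """Positive values favour the novel arm template (preplay),
    negative values favour the familiar track template (replay)."""
    return abs(r_novel) - abs(r_familiar)


def classify_event(sig_novel: bool, sig_familiar: bool) -> str:
    """Group an event by which templates it is significantly correlated with."""
    if sig_novel and sig_familiar:
        return "preplay/replay"
    if sig_novel:
        return "preplay"
    if sig_familiar:
        return "replay"
    return "not significant"
```

For example, an event with a correlation of 0.9 with the novel arm template and -0.2 with the familiar track template has an index of about 0.7 under this reading.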

Correlations between pairs of familiar track and novel arm templates (Fig. 2h) 
were performed using modified familiar track templates that were constructed 
using the location of peak firing (>0 Hz) of only those cells that had place fields on 
the novel arm (peak rate, >1 Hz). The lack of significant correlation in this case 
demonstrates that the novel arm place cell sequence is not simply a transposition 
of a familiar track place cell sequence on the novel arm. 

We also identified neurons that did not fire during Fam-Run, that activated 
during Fam-Rest events and that corresponded to trajectories on the novel arm 
during Contig-Run (silent cells). We calculated the correlation between the order 
in which they fired during Fam-Rest events and their spatial sequence as new place 
cells on the novel arm during Contig-Run, as previously explained. Owing to the 
low absolute number of silent neurons, only triplets of cells were available for 
further analysis (n = 24). The proportion of events perfectly matching the spatial 
template was compared with the proportion of by-chance perfect matching (0.33). 
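
The chance level of 0.33 for triplets can be checked by enumeration, assuming that a perfect match means the firing order equals the template in either the forward or the reverse direction. The function name `chance_of_perfect_match` is hypothetical.

```python
from itertools import permutations


def chance_of_perfect_match(n_cells: int = 3) -> float:
    """Fraction of all firing orders that match the template forwards or backwards."""
    template = tuple(range(n_cells))
    orders = list(permutations(template))
    perfect = [o for o in orders if o == template or o == template[::-1]]
    return len(perfect) / len(orders)


print(round(chance_of_perfect_match(3), 2))
```

Three cells can fire in 3! = 6 orders, of which two (forward and reverse) match the template perfectly, giving 2/6, or about 0.33.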

Stability of place cell maps. Stabilities of place cell firing on the familiar track 
before and after barrier removal as well as on the novel track (de novo condition) 
and the novel arm (Contig condition) in the beginning versus the end of the run 
session were assessed by calculating, for each place cell and each direction, a 
correlation between the spatial firing in the corresponding paired situations 
(before versus after barrier removal for the familiar track or the first four laps 
versus the last four laps of the De novo-Run or Contig-Run session for the novel 
track or arm, respectively). The place cell activity was not partitioned in place 
fields; rather, the whole activity on the particular track or arm was considered 
separately for each cell and direction (average correlations are shown in Figs 2e 
and 4D, blue bars). In addition, we calculated the same type of correlation after 
shuffling the identity of the cell in one member of the correlation (once for each 
different cell; average correlations are in Figs 2e and 4D, black bars). Shuffle results 
(Figs 2e and 4D, black bars) were computed as correlation between spatial tuning 
of cells on the familiar track during Fam-Run and spatial tuning of all other 
simultaneously recorded cells on the familiar arm during Contig-Run (familiar 
track group; Fig. 2e, left), or correlation between spatial tuning of cells on the novel 
arm (or novel track) during the beginning of Contig-Run (or De novo-Run) and 
spatial tuning of all the other simultaneously recorded cells on the novel arm (or 
novel track) during the end of Contig-Run (novel arm group; Fig. 2e, right) or De 
novo-Run (Fig. 4D). Original and shuffled correlations were compared using the 
rank-sum test. The average number of laps (traversal of the novel track in both 
directions) per session was 20.5 in De novo-Run (21, 16, 27 and 18 in the four mice) 
and 16.3 in Contig-Run (13, 14 and 22 in the three mice). 
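
The stability measure and its cell-identity shuffle can be sketched as follows. The sketch assumes a Pearson correlation between whole-track rate maps (the text does not name the correlation type) and omits the rank-sum comparison; the function names and the toy rate maps used to exercise it are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long rate maps."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def stability(early_maps, late_maps):
    """Return (mean same-cell correlation, mean shuffled correlation).

    early_maps, late_maps -- dicts mapping cell id to a rate map, for example
    the first four laps versus the last four laps of a run session.
    The shuffle pairs each cell with every other simultaneously recorded cell.
    """
    cells = list(early_maps)
    same = [pearson(early_maps[c], late_maps[c]) for c in cells]
    shuffled = [pearson(early_maps[c], late_maps[o])
                for c in cells for o in cells if o != c]
    return mean(same), mean(shuffled)
```

A stable map gives a same-cell correlation well above the shuffled value, which is the comparison summarized by the blue and black bars referred to above.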

Bayesian reconstruction of actual and virtual trajectories. For each cell, we 
calculated a linearized spatial tuning curve on the familiar track during the 
Fam-Run session and a linearized spatial tuning curve on the novel arm during 
the Contig-Run session. The tuning curves were constructed in 2-cm bins from 
spikes emitted in both run directions at velocities higher than 5 cm s⁻¹ and were 
smoothed with a Gaussian kernel with a standard deviation of 2 cm. We constructed 
a joint spatial tuning curve for each cell by juxtaposing the spatial tuning 
curve on the familiar track during the Fam-Run session and the spatial tuning 
curve on the novel arm during the Contig-Run session. We also detected for each 
cell all the spiking activity emitted at velocities below 5 cm s⁻¹ during the 
Fam-Rest session, where replay and preplay events were shown to occur using the 
rank-order correlation method. We used a Bayesian reconstruction algorithm®'* 
to decode the virtual position of the animal from the spiking activity during Fam- 
Rest (Fig. 3b) in non-overlapping, 20-ms bins using the joint spatial tuning curves. 
We then extracted epochs of reconstructed trajectory matching the time of the 
spiking events as detected using multiunit activity of place cells from the familiar 
track and novel arm (rank-order correlation method; see ‘Preplay and replay 
analyses’, above). 
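
The decoding of one 20-ms bin can be sketched with the standard Poisson formulation of Bayesian reconstruction. The sketch assumes independent Poisson firing and a flat prior over position, and it floors the tuning curves at a small rate to avoid taking the logarithm of zero; these choices, and the function name `decode_bin`, are assumptions of the illustration rather than details given in the text.

```python
import math


def decode_bin(spike_counts, tuning_curves, bin_s=0.020, min_rate_hz=1e-3):
    """Posterior probability over position bins for one time bin.

    spike_counts  -- dict mapping cell id to spike count in the time bin
    tuning_curves -- dict mapping cell id to a list of rates (Hz) per position bin,
                     here the joint familiar track plus novel arm curve
    """
    n_pos = len(next(iter(tuning_curves.values())))
    log_post = []
    for x in range(n_pos):
        lp = 0.0
        for cell, curve in tuning_curves.items():
            rate = max(curve[x], min_rate_hz)
            # Poisson log-likelihood up to terms that do not depend on position.
            lp += spike_counts.get(cell, 0) * math.log(rate) - bin_s * rate
        log_post.append(lp)
    peak = max(log_post)
    post = [math.exp(lp - peak) for lp in log_post]
    total = sum(post)
    return [p / total for p in post]
```

Stacking the posteriors of consecutive bins column by column gives the probability distribution function of an event, to which a line can then be fitted.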

We used two shuffling procedures to measure the quality of the Bayesian 
decoding. In the first shuffling procedure, for each event, the original time-bin 
columns of the probability distribution function (PDF) were replaced with an 
equal number of time-bin columns randomly extracted from a pool containing 
the time-bin columns of all PDFs of all detected events®. The shuffling procedure 
was repeated 500 times. In the second shuffling procedure, the identity of the place 
cells was randomly shuffled 100 times and new PDFs were calculated for all events. 
For all original and shuffled PDFs, a line was fitted to the data using a previously 
described line-finding algorithm’. Lines fitted to the original and shuffled data 
were compared using slope, spatial extent, location on the track and probability 
score. We defined replay and preplay as the epochs of Fam-Rest in which the 
reconstructed trajectory was located on the familiar track or the novel arm, 
respectively. The trajectory was defined across a set of position estimates during 
the corresponding epoch (Fig. 3c). Only epochs that lasted at least 60 ms (three 
bins) and which contained reconstructed trajectories spanning at least 10 cm were 
considered for further analysis. Trajectories for which 75% or more of their length 
was located on the familiar track were considered to represent replay of an animal’s 
trajectory on the familiar track (Fig. 3c, middle), and trajectories for which 75% or 
more of their length was located in the novel arm were considered to represent 
preplay of the animal’s future trajectory on the novel arm (Fig. 3c, top). The 
remaining events were considered preplay-replay (Fig. 3c, bottom). 
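
The inclusion criteria and the 75% rule can be written as a small sketch. It assumes the joint track is linearized so that the familiar track occupies positions from 0 to `familiar_len_cm` and the novel arm lies beyond it, and that each trajectory is summarized by the two end points of its fitted line; the function and parameter names are hypothetical.

```python
def classify_trajectory(start_cm, end_cm, n_bins, familiar_len_cm,
                        min_bins=3, min_span_cm=10.0, min_fraction=0.75):
    """Label a reconstructed trajectory, or return None if it is too short.

    start_cm, end_cm -- end points of the fitted line on the joint track
    n_bins           -- number of 20-ms bins in the epoch (three bins = 60 ms)
    """
    lo, hi = sorted((start_cm, end_cm))
    span = hi - lo
    if n_bins < min_bins or span < min_span_cm:
        return None
    # Fraction of the trajectory length that lies on the familiar track.
    on_familiar = max(0.0, min(hi, familiar_len_cm) - max(lo, 0.0)) / span
    if on_familiar >= min_fraction:
        return "replay"
    if 1.0 - on_familiar >= min_fraction:
        return "preplay"
    return "preplay-replay"
```

For instance, with a 90-cm familiar track, a trajectory from 70 cm to 110 cm lies half on each segment and falls into the preplay-replay group.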

An epoch was considered significant if the new line was less than 75% contained 
in the familiar track for replay or novel arm for preplay in at least 95% of the 
shuffled cases. For each epoch that was significant for replay or preplay using the 
reconstruction method, we retrieved the value of the rank-order correlation 
between the neuronal firing sequences and the familiar track and novel arm spatial 
templates as calculated using the rank-order correlation method. We compared 
the absolute correlation values between the epoch’s firing sequences and familiar 
track templates with the absolute correlation values between the same epoch’s 
firing sequences and novel arm templates. We also reconstructed the trajectory of 
the animal on the familiar track from the spiking activity during the Fam-Run 
session at velocities above 5 cm s⁻¹ in 250-ms bins using the spatial tuning curves 
on the familiar track*’* (Fig. 3a) to validate the decoding procedure. 

Shell structure and magic numbers in atomic nuclei were generally 
explained by pioneering work’ that introduced a strong spin-orbit 
interaction to the nuclear shell model potential. However, know- 
ledge of nuclear forces and the mechanisms governing the struc- 
ture of nuclei, in particular far from stability, is still incomplete. In 
nuclei with equal neutron and proton numbers (N = Z), enhanced 
correlations arise between neutrons and protons (two distinct 
types of fermions) that occupy orbitals with the same quantum 
numbers. Such correlations have been predicted to favour an 
unusual type of nuclear superfluidity, termed isoscalar neutron- 
proton pairing” °, in addition to normal isovector pairing. Despite 
many experimental efforts, these predictions have not been con- 
firmed. Here we report the experimental observation of excited 
states in the N = Z = 46 nucleus ⁹²Pd. Gamma rays emitted following 
the ⁵⁸Ni(³⁶Ar,2n)⁹²Pd fusion-evaporation reaction were identified 
using a combination of state-of-the-art high-resolution 
γ-ray, charged-particle and neutron detector systems. Our results 
reveal evidence for a spin-aligned, isoscalar neutron-proton coup- 
ling scheme, different from the previous prediction” °. We suggest 
that this coupling scheme replaces normal superfluidity (charac- 
terized by seniority coupling’”*®) in the ground and low-lying 
excited states of the heaviest N = Z nuclei. Such strong, isoscalar 
neutron-proton correlations would have a considerable impact on 
the nuclear level structure and possibly influence the dynamics of 
rapid proton capture in stellar nucleosynthesis. 

For all known nuclei, including those residing along the N = Z line 
up to around mass 80, a detailed analysis of properties such as binding 
energies’ and the spectroscopy of excited states’ strongly suggests that 
normal isovector (isospin T = 1, see Fig. 1) pairing is dominant at low 
excitation energies. On the other hand, there are long-standing pre- 
dictions’® for a change in the heavier N = Z nuclei, from a nuclear 
superfluid dominated by isovector pairing to a structure where iso- 
scalar (T = 0) neutron-proton (np) pairing has a major influence, as 
the mass number increases towards the exotic doubly magic nucleus 
¹⁰⁰Sn, the heaviest N = Z nucleus predicted to be bound. 

Nuclei with N = Z and mass number >90 can only be produced in the 
laboratory with very low cross-sections. The related problems of identifying 
and distinguishing such reaction products and their associated 
γ-rays from the vast array of N > Z nuclei that are present in much 
greater numbers from the reactions used have prevented observation 
of their low-lying excited states until now. In the present work, the 
experimental difficulties have been overcome through the use of a highly 
efficient detector system and a prolonged experimental running period. 

Excited states in ⁹²Pd were populated following heavy-ion fusion-
evaporation reactions at GANIL (Grand Accélérateur National d’Ions 
Lourds), France. ³⁶Ar ions, accelerated to a kinetic energy of 111 MeV, 
were used to bombard an isotopically enriched (99.83%) ⁵⁸Ni target. 
Light charged particles (mainly protons and α-particles), neutrons and 
γ-rays emitted in the reactions were detected in coincidence. A schematic 
layout of the experimental set-up is shown in Fig. 2. 

The two-neutron (2n) evaporation reaction channel following 
formation of the ⁹⁴Pd compound nucleus, leading to ⁹²Pd, was very 
weakly populated, with a relative yield of less than 10 ° of the total 
fusion cross-section. Gamma-rays from decays of excited states in ⁹²Pd 
were identified by comparing γ-ray spectra in coincidence with two 
emitted neutrons and no charged particles with γ-ray spectra in coincidence 
with other combinations of neutrons and charged particles. 
The typical efficiency for detecting any charged particle was 66%. This 
number rises to 88% or higher if more than one such particle is emitted 
in a particular reaction channel. The clean identification of neutrons is 
crucial, as scattering of neutrons from one detector segment to another 
can be misinterpreted as two neutrons; this would give rise to a back- 
ground from the much more prolific reaction channels (where only 
one neutron has been emitted) in γ-ray spectra gated by two neutrons. 
But because neutrons have a finite velocity, the difference in detection 

Figure 1 | Schematic illustration of the two possible pairing schemes in 
nuclei. a, The normal isospin T = 1 triplet. The two like-particle pairing 
components are responsible for most known effects of nuclear superfluidity. 
Within a given shell these isovector components are restricted to spin zero 
owing to the Pauli principle. b, Isoscalar T = 0 neutron-proton pairing. Here 
the Pauli principle allows only non-zero components of angular momentum. 


Figure 2 | Schematic illustration of the experimental set-up used to identify 
γ-ray transitions from excited states in ⁹²Pd. The light particles and γ-rays 
emitted from the ³⁶Ar + ⁵⁸Ni reaction were observed using three different 
detector systems. The innermost detector array, DIAMANT'®”’ (green), which 
consisted of 80 CsI scintillators, was used to detect light charged particles, 
mainly α-particles and protons, and acted as a veto detector in the selection of 
events with no charged particles emitted. The Neutron Wall’* (orange), 
comprising 50 liquid scintillator detectors and covering a solid angle of 17 in 
the forward direction, was used for the detection of evaporated neutrons. It is 
able to discriminate between neutron and y-ray interactions by means of a 
combined time-of-flight and pulse-shape analysis technique. Gamma-rays 
emitted from the reaction products were detected using the EXOGAM'”° 
high-purity Ge detector system (blue). Seven segmented Ge clover detectors 
were placed at an angle of 90° and four detectors at an angle of 135° relative to 
the beam direction, leaving room for the Neutron Wall at forward angles. 


time is typically smaller for interactions resulting from two separate 
neutrons compared to a single scattered neutron. Background contri- 
butions from neutron scattering in 2n-gated spectra were significantly 
reduced by applying a criterion to the time difference in the time-of- 
flight parameter, relative to the distance between the neutron detectors 
firing. After such corrections, the efficiency for correctly identifying 
both neutrons from a 2n-event was 3%. Figure 3a–c shows projected 
γ-ray spectra from the charged particle-vetoed, 2n-selected Eγ–Eγ 
coincidence matrix when γ-rays coincident with the 874 keV, 
912 keV and 750 keV transitions (Fig. 3a–c, respectively) assigned to 
⁹²Pd are selected. By comparing spectra with and without the charged 
particle veto condition applied, it is clear that these γ-rays are not 
associated with emission of charged particles from the compound 
nucleus. Figure 3d shows a plot of the intensity ratios of the 
874 keV, 912 keV and 750 keV γ-rays (filled circles) in coincidence 
with two neutrons and one neutron, respectively, proving that the 
γ-rays assigned to ⁹²Pd belong to the 2n-evaporation reaction channel. 
An extensive literature search was also performed in order to exclude 
the possibility that the γ-rays assigned to ⁹²Pd could be due to the 
decay of excited states in some other nucleus. In particular, γ-rays from 
reactions involving possible target impurities were taken into account. 
See Supplementary Information for further details on the data analysis. 
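
The charged-particle efficiencies quoted above (66% for a single particle, 88% or higher when more than one is emitted) are consistent with a simple independence argument. The sketch below is only an illustration, assuming each emitted particle is detected independently with the same efficiency; the function name is hypothetical and not taken from the analysis itself.

```python
def detection_probability(single_efficiency: float, n_particles: int) -> float:
    """Probability that at least one of n emitted charged particles is detected.

    Assumes (for illustration only) that every particle is detected
    independently with the same single-particle efficiency.
    """
    return 1.0 - (1.0 - single_efficiency) ** n_particles


# 66% is the single-particle efficiency quoted in the text.
for n in (1, 2, 3):
    print(n, round(detection_probability(0.66, n), 3))
# prints:
# 1 0.66
# 2 0.884
# 3 0.961
```

Under this assumption two emitted particles give 1 − 0.34² ≈ 0.88, which is why a veto on charged particles is more effective against multi-particle channels than against single-particle ones.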
The three most intense γ-ray transitions assigned to ⁹²Pd (874 keV, 
912 keV and 750 keV) have been ordered into a ground-state band 
based on their relative intensities (Fig. 3e). The uncertainties in the 
relative intensities of the γ-ray transitions translate into a corresponding 
uncertainty in their ordering and, consequently, in the absolute 
positions of the 2⁺ and 4⁺ states. As shown in Fig. 3, these γ-rays form 
a mutually coincident decay sequence. Although the limited statistics 
preclude an accurate angular distribution analysis and hence firm spin 
assignments, it is likely that the 874 keV, 912 keV and 750 keV γ-ray 
transitions constitute a cascade of stretched E2 transitions depopulating 
the first excited 2⁺, 4⁺ and 6⁺ states, respectively (Fig. 3e). 
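
The intensity-based ordering can be made concrete with a short sketch. It assumes, as in the text, that the transitions form a single cascade in which intensity decreases with increasing spin; the energies and relative intensities are those quoted in the Fig. 3 caption, whereas the function name and the omission of uncertainties are simplifications introduced here.

```python
# (energy in keV, relative intensity in %) as quoted in the Fig. 3 caption.
transitions = [(749.8, 50), (873.6, 100), (912.4, 77)]


def build_band(transitions):
    """Order transitions by decreasing intensity and sum them into level energies.

    The strongest transition is placed at the bottom of the band (2+ -> 0+),
    the next one above it, and so on. Uncertainties are ignored in this sketch.
    """
    ordered = sorted(transitions, key=lambda t: t[1], reverse=True)
    levels = []
    energy = 0.0
    for spin, (e_gamma, _intensity) in zip((2, 4, 6), ordered):
        energy += e_gamma
        levels.append((spin, round(energy, 1)))
    return levels


print(build_band(transitions))
# prints: [(2, 873.6), (4, 1786.0), (6, 2535.8)]
```

Because the intensities carry uncertainties of several per cent, the ordering of neighbouring transitions, and with it the level energies obtained by summation, is not fixed beyond doubt.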
Nuclei immediately below ¹⁰⁰Sn on the Segrè (N, Z) chart, with N, 
Z < 50, may show special structural features, as the valence neutrons 
and protons here can move in identical orbits. Here, for the heaviest 
Figure 3 | Identification of γ-ray transitions in ⁹²Pd. a–c, Gamma-ray 
energy spectra detected in coincidence with the 874 keV, 912 keV and 750 keV 
γ-rays, with the additional requirement that two neutrons and no charged 
particle(s) were detected in coincidence. These γ-rays, assigned to depopulate 
the 2⁺, 4⁺ and 6⁺ states in ⁹²Pd, respectively, are marked by filled circles. 
Gamma-rays from the ³⁶Ar-induced 1p1n-evaporation reaction on small 
amounts of carbon deposited on the target during irradiation, leading to the 
production of ⁴⁶V nuclei, are visible in b and c (open triangles). These γ-rays 
appear in the projected spectra owing to a combined effect of the limited 
detection efficiency for charged particles, the finite neutron/γ separation in the 
neutron detectors, the presence of γ-ray transitions at 914.9 keV and 750.7 keV 
in the level scheme of ⁴⁶V (refs 21, 22), and the fact that the reaction products 
from carbon contamination may recoil out of the target material, leading to 
Doppler broadening of such γ-rays. d, Intensity ratios of the γ-rays assigned to 
⁹²Pd in coincidence with two neutrons and one neutron, respectively. The 
dashed line indicates the value expected for γ-rays in coincidence with two 
neutrons, obtained from the relative 1n- and 2n-detection efficiencies. 
Measured intensity ratios for γ-rays from previously known reaction products 
(Ru? and IR) are included for comparison. e, Level scheme assigned to 
⁹²Pd. The assigned spin-parity (left) and level energy (right, in keV) are given for 
each level. The energies (in keV) and relative intensities (in %, normalized to the 
intensity of the 874 keV transition) of the γ-ray transitions assigned to ⁹²Pd are 
as follows: 873.6(2), 100(8); 912.4(2), 77(5); 749.8(3), 50(6). Given uncertainties 
are standard (statistical) errors. 


N ~ Z nuclei, state-of-the-art shell model calculations predict the 
appearance of ground-state and low-lying yrast structures based on 
spin-aligned systems of np pairs, similar to a scenario proposed more 
than four decades ago¹¹. The np-paired ground-state configuration 
emanates from the strong attractive interaction between g9/2 neutrons 
and protons in aligned angular momentum (J = 9) coupling, and is 
hence different from the predictions of a BCS type of isoscalar np 
pairing condensate in N ~ Z nuclei. The shell model calculations 
were performed using empirical two-body matrix elements in the f5/2 
p3/2 p1/2 g9/2 model space; see Supplementary Information for details. 

In Fig. 4c we show the results, compared with experimental data for 
⁹²Pd (this work), ⁹⁴Pd (ref. 12) and ⁹⁶Pd (ref. 13). The level structure of 
the semi-magic (N = 50) nucleus ⁹⁶Pd, with four proton holes relative 
to the Z = 50 closed shell core, exhibits the typical traits of a nucleus in 
the normal isovector pairing phase for which the seniority coupling 
scheme dominates. A transition from the ground state to the first 
excited 2⁺ state requires the breaking of one g9/2 proton-hole pair, 
and therefore the energy spacing between these two levels is rather 
large. The distance between the subsequent levels gradually decreases 
as the angular momentum vectors of the g9/2 quasiproton holes align 
Figure 4 | Illustration of the predicted ground-state wavefunctions of ⁹²Pd 
and ⁹⁶Pd, and comparison of calculated and experimental level energies in 
⁹²Pd, ⁹⁴Pd and ⁹⁶Pd. a, Schematic illustration of the structure of the ground- 
state wavefunction of ⁹²Pd in the spin-aligned np paired phase (green, neutron 
hole; red, proton hole). The main component of the wavefunction can be 
viewed as a system of deuteron-like np hole pairs with respect to the ¹⁰⁰Sn 
‘core’, spinning around the centre of the nucleus. b, As a but for ⁹⁶Pd in the 
normal pairing phase. c, Experimental level energies (keV) in the ground-state 
bands of ⁹²Pd (present work) and ⁹⁴,⁹⁶Pd (refs 12, 13) compared with shell 
model predictions. Calculated B(E2; 2⁺ → 0⁺) and B(E2; 4⁺ → 2⁺) values 


until the 8⁺ state is reached. Here, the angular momentum vectors of 
one pair of proton holes are maximally aligned, and in order to reach 
higher-lying states the other proton-hole pair has to be broken. This 
spin sequence terminates in the 12⁺ state, where all four proton holes 
in the g9/2 orbital are fully aligned. In contrast, the calculated spectrum 
of ⁹²Pd, with four proton holes and four neutron holes relative to the 
¹⁰⁰Sn core, has a nearly constant energy spacing between consecutive 
levels. To examine the influence of different components of the np 
interaction on this spectrum, we also performed the same calculation 
while including only the T = 0 component of the interaction matrix 
elements, including only the T = 1 components, or excluding all np 
interactions (that is, all T_z = 0 components). As seen in Fig. 4c, the 
calculated spectrum of ⁹²Pd for the latter case has a strong resemblance 
to the spectrum of the closed neutron shell nucleus ⁹⁶Pd. For the full 
calculation (case SM in Fig. 4c), the calculated energy spectra agree 
very well with those deduced from experiment. It is evident that the 
T = 0 component of the np interactions plays a dominating role for the 
spectrum of ⁹²Pd, whereas such interactions between the valence 
nucleons are absent in ⁹⁶Pd. The calculated wavefunctions for the 
ground state and low-lying yrast states in ⁹²Pd are completely dominated 
by the isoscalar np pairs in the spin-aligned J^π = 9⁺ coupling. 
The nucleus ⁹⁴Pd represents an interesting intermediate case. 
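
The spins at which the seniority sequence changes character (8⁺ for one fully aligned pair, 12⁺ for all four holes) follow from the Pauli principle: identical fermions in the same j shell must occupy different magnetic substates, so the largest total spin is the sum of the largest available projections. A minimal sketch of that counting for the g9/2 orbital is given below; the function name is hypothetical, and holes are treated in the same way as particles.

```python
from fractions import Fraction


def max_aligned_spin(j: Fraction, n: int) -> Fraction:
    """Largest total angular momentum of n identical fermions in a single j shell.

    The Pauli principle forbids two identical fermions in the same m substate,
    so the maximum projection is j + (j - 1) + ... over the n largest m values.
    """
    if n < 0 or n > 2 * j + 1:
        raise ValueError("a j shell holds between 0 and 2j + 1 identical fermions")
    return sum((j - k for k in range(n)), Fraction(0))


j = Fraction(9, 2)  # the g9/2 orbital
print(max_aligned_spin(j, 2))  # prints: 8
print(max_aligned_spin(j, 4))  # prints: 12
```

The same counting does not limit a neutron-proton pair, because the two particles are not identical and can both occupy the m = 9/2 substate, which gives the aligned J = 9 coupling discussed in the text.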

A simple semiclassical picture of the isoscalar spin-aligned coupling 
scheme can help to illustrate why the low-lying yrast states in ⁹²Pd are 
nearly equidistant. It is a consequence of the variation in the spatial 
overlap between the valence particle wavefunctions as the angular 
momentum vectors of np hole pairs circling in one direction align 
with the angular momentum vectors of those circling in the opposite 
direction, yielding the total angular momentum 2ħ, 4ħ, 6ħ, and so on. 
This variation is approximately linear for small values of angular 


(Weisskopf units) are also shown in italics below the corresponding initial 
levels. Text at bottom of each set of levels shows nuclide and gives further 
details. The calculated spectra for ⁹²,⁹⁴Pd include, in addition to full neutron- 
proton interactions (SM), also results for pure T = 0 and pure T = 1 neutron- 
proton interactions. The results obtained without residual neutron-proton 
interactions (no np: that is, normal seniority coupling involving only isovector, 
T = 1, nn and pp pairing), are also shown for ⁹²,⁹⁴Pd. The vertical dashed lines 
separate information for different nuclides. Blue boxes highlight the 
experimental data for ⁹²Pd and ⁹⁶Pd, emphasizing their different structures. 


momentum. This mechanism for generating the total angular 
momentum in the nucleus is quite different from those present in 
normal superfluid nuclei. The regularly spaced level sequence 
observed in the full calculation for ⁹²Pd is therefore a distinct signature 
of the spin-aligned isoscalar mode, in the absence of collective vibra- 
tional excitations (see Supplementary Information for further details). 
The fact that the ordering of the experimentally observed γ-ray transitions 
is affected by some uncertainty does not change the interpretation 
of the data. The effect of a different ordering would be a maximal 
change in the 2⁺ and 4⁺ energies by 124 keV and 162 keV, respectively, 
still in approximate agreement with the theoretical prediction. 
The special topology of the ground-state wavefunction predicted for 
⁹²Pd is illustrated schematically in Fig. 4a, and may be compared with 
the case for the normal pairing phase in ⁹⁶Pd (Fig. 4b). In the spin-aligned 
np paired phase, the main component of the nuclear ground-state 
wavefunction can, in a semiclassical picture, be regarded as built 
of a system of deuteron-like np hole pairs spinning around the core, 
each with maximum angular momentum. The special character of the 
wavefunction also implies a deformed intrinsic structure. 
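
The quoted maximal shifts of 124 keV and 162 keV can be checked by enumerating every possible ordering of the three transitions. The sketch below uses the rounded transition energies given in the text; the variable and function names are illustrative only.

```python
from itertools import permutations

# Rounded transition energies in keV, in the adopted order:
# 2+ -> 0+, 4+ -> 2+, 6+ -> 4+.
adopted = (874, 912, 750)


def level_energies(order):
    """Return (E(2+), E(4+)) for a cascade whose lowest transition is order[0]."""
    e2 = order[0]
    e4 = order[0] + order[1]
    return e2, e4


ref_e2, ref_e4 = level_energies(adopted)
shift_e2 = max(abs(level_energies(p)[0] - ref_e2) for p in permutations(adopted))
shift_e4 = max(abs(level_energies(p)[1] - ref_e4) for p in permutations(adopted))
print(shift_e2, shift_e4)
# prints: 124 162
```

The 6⁺ energy is the sum of all three transitions and is therefore unaffected by the ordering.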

Although the experimental data presented in this Letter strongly 
suggest that a spin-aligned neutron-proton paired phase is present in 
⁹²Pd, further experimental information is needed to confirm this interpretation. 
In particular, measurements of particle transfer reactions, 
extraction of electromagnetic transition rates (B(E2; 0⁺ → 2⁺) values) 
using Coulomb excitation, and precise mass measurements would help 
to elucidate the structural evolution of nuclei along the N = Z line and 
to develop a better understanding of neutron-proton correlations and 
their implications for nuclear shell structure far from stability. This is 
also of importance for understanding the reaction rates as well as the 
end point of the astrophysical rapid proton capture process, which 
may influence the composition and X-ray burst profiles of accreting 
neutron stars, and the nucleosynthesis of neutron deficient isotopes. 

Rapid evolutionary innovation during an Archaean genetic expansion 

Lawrence A. David & Eric J. Alm 


The natural history of Precambrian life is still unknown because of 
the rarity of microbial fossils and biomarkers. However, the 
composition of modern-day genomes may bear imprints of ancient 
biogeochemical events. Here we use an explicit model of macroevolution 
including gene birth, transfer, duplication and loss 
events to map the evolutionary history of 3,983 gene families across 
the three domains of life onto a geological timeline. Surprisingly, 
we find that a brief period of genetic innovation during the 
Archaean eon, which coincides with a rapid diversification of 
bacterial lineages, gave rise to 27% of major modern gene families. 
A functional analysis of genes born during this Archaean expansion 
reveals that they are likely to be involved in electron-transport 
and respiratory pathways. Genes arising after this expansion show 
increasing use of molecular oxygen (P = 3.4 × 10⁻⁸) and redox-sensitive 
transition metals and compounds, which is consistent 
with an increasingly oxygenating biosphere. 

Describing the emergence of life on our planet is one of the grand 
challenges of the biological and Earth sciences. Yet the roughly three- 
billion-year history of life preceding the emergence of hard-shelled 
metazoans remains largely unknown. So far, the best-understood 
event in early Earth history is the Great Oxidation Event, which is 
believed to have followed the development of oxygenic photosynthesis 
by ancestors of modern cyanobacteria (although the precise timeline 
remains controversial). If DNA sequences from extant organisms 
bear an imprint of this event, they can be used to make and test 
predictions; for example, genes that use molecular oxygen are more 
likely to appear in organisms that emerged after the Great Oxidation 
Event. However, the transfer of genes across species can obscure patterns 
of descent and disrupt our ability to correlate gene histories with the 
geochemical record. For example, widely distributed genes may descend 
from a Last Universal Common Ancestor, as is widely believed to have 
occurred for the translational machinery, or they may have been dispersed 
by horizontal gene transfer (HGT), as with antibiotic resistance 
cassettes. 

We developed a new phylogenomic method, AnGST (analyser of 
gene and species trees), that explicitly accounts for HGT by comparing 
individual gene phylogenies with the phylogeny of organisms (the ‘tree 
of life’) and generated detailed evolutionary histories for 3,983 major 
gene families. Gene histories reveal marked changes in the rates of gene 
birth, gene duplication, gene loss and HGT over geological timescales 
(Fig. 1), including a burst of de novo gene-family birth between 3.33 
and 2.85 Gyr ago, which we refer to as the Archaean Expansion. This 
window gave rise to 26.8% of extant gene families and coincides with a 
rapid bacterial cladogenesis (Supplementary Fig. 15). A spike in the 
rate of gene loss (about 3.1 Gyr ago) follows the expansion and may 
represent the consolidation of newly evolved phenotypes, as ancestral 
genomes became specialized for emerging niches. After 2.85 Gyr ago, 
the rates of both gene loss and gene transfer stabilized at roughly 
modern-day levels. The rates of de novo gene birth and duplication 
after the Archaean Expansion seem to show opposite trends: de novo 
gene-family birth rates decrease and duplication rates increase over 


time. The near absence of de novo birth in modern times probably 
reflects the fact that ORFan gene families (gene families found in only a 
single genome), which are widespread across all major prokaryotic 
groups, are not considered in this study. The excess of gene duplications 
and ORFans in modern genomes suggests that novel genes from 
both sources experience high turnover. Although we did not observe 
changes in the rate of HGT after the Archaean Expansion, we did 
detect an over-representation of HGT from α-proteobacteria to 
ancient eukaryotes (P = 3.3 X 10 ’, Wilcoxon rank sum test) and 
from cyanobacteria to plants (P = 8.3 X 10 °, Wilcoxon rank sum 
test). These patterns of HGT probably reflect the endosymbioses that 
gave rise to mitochondria and chloroplasts, and serve to validate 
our phylogenomic approach. 

What evolutionary factors were responsible for the period of 
innovation marked by the Archaean Expansion? Although we cannot 
provide an unequivocal answer to this question with the use of gene 
birth dates alone, we can ask whether the functions of genes born 
during this time suggest plausible hypotheses. In general, birth of 
metabolic genes was enriched during the expansion, and especially 
those involved in energy production and coenzyme metabolism (Sup- 
plementary Table 2); however, further inspection also reveals an 
enrichment for metabolic-gene-family birth before the Archaean 
Expansion. To focus on specific metabolic changes linked to the 
Archaean Expansion we first grouped genes according to the metabolites 
they used, and then directly compared the occurrence of these metabo- 
lites in genes born during the Archaean Expansion with their abundance 
before the Archaean Expansion. The results are striking: the metabolites 
specific to the Archaean Expansion (positive bars in Fig. 2 inset) include 
most of the compounds annotated as redox/e⁻ transfer (blue bars), with 
Fe-S-binding, Fe-binding and O₂-binding gene families showing the 
most significant enrichment (false discovery rate < 5%, Fisher’s exact 
test). Gene families that use ubiquinone and FAD (key metabolites in 
respiration pathways) are also enriched, albeit at slightly lower significance 
levels (false discovery rate < 10%). The ubiquitous NADH and 
NADPH are a notable exception to this trend and seem to have had a 
function early in life history. By contrast, enzymes linked to nucleotides 
(green bars) showed strong enrichment in genes of more ancient origin 
than the expansion. 
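
To illustrate the kind of comparison described above, the sketch below applies a one-sided Fisher's exact test to a 2 × 2 table of gene-family counts. It is not the analysis code used for this study: the counts are hypothetical, the function name is an assumption, and whether the published test was one- or two-sided is not stated in the text.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, under the hypergeometric null with fixed margins,
    of seeing a count of at least `a` in the top-left cell.
    """
    row1 = a + b          # families born during the expansion
    col1 = a + c          # families that use the metabolite
    total = a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k) for k in range(a, upper + 1)
    )
    return tail / comb(total, row1)


# Hypothetical counts, for illustration only:
# rows = born during / before the expansion, columns = uses / does not use metabolite.
p_value = fisher_exact_greater(30, 70, 20, 180)
print(p_value < 0.05)
```

In a screen over many metabolites, the resulting P values would then be corrected for multiple testing, which is the role of the false discovery rate thresholds quoted in the text.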

The observed bias in metabolite use suggests that the Archaean 
Expansion was associated with an expansion in microbial respiratory 
and electron transport capabilities. Proving this association to be causal 
is beyond the power of our phylogenomic model. Yet this hypothesis is 
appealing because more efficient energy conservation pathways could 
increase the total free-energy budget available to the biosphere, possibly 
enabling the support of more complex ecosystems and a concomitant 
expansion of species and genetic diversity. We note, however, that 
although the use of oxygen as a terminal electron acceptor would have 
significantly increased biological energy budgets, oxygen-using genes 
were only enriched towards the end of the expansion (Supplementary 
Fig. 10). Thus, the earliest redox genes identified as part of the expansion 
were likely to have been used in anaerobic respiration or in oxygenic or 
Figure 1 | Rates of macroevolutionary events over time. Average rates of gene 
birth (red), duplication (blue), HGT (green), and loss (yellow) per lineage (events 
per 10 Myr per lineage) are shown. Events that increase gene count are plotted to 
the right, and gene loss events are shown to the left. Genes already present at the 
Last Universal Common Ancestor are not included in the analysis of birth rates 
because the time over which those genes formed is not known. The Archaean 
Expansion (AE) was also detected when 30 alternative chronograms were 
considered (Supplementary Fig. 9). The inset shows metabolites or classes of 
metabolites ordered according to the number of gene families that use them that 
were born during the Archaean Expansion compared with the number born 


anoxygenic photosynthesis and may have been co-opted later for use in 
aerobic respiration pathways. 

Our metabolic analysis supports an increasingly oxygenated bio- 
sphere after the Archaean Expansion, because the fraction of proteins 
using oxygen gradually increased from the expansion to the present 
day (Fig. 2; P = 3.4 × 10⁻⁸, two-sided Kolmogorov-Smirnov test). 
Further indirect evidence of increasing oxygen levels comes from compounds 
whose availability is sensitive to global redox potential. We 
observe significant increases over time in the use of the transition metals 
copper and molybdenum (Fig. 2; false discovery rate < 5%, two-sided 
Kolmogorov-Smirnov test), which is in agreement with geochemical 
models of these metals’ solubility in increasingly oxidizing oceans and 
with molybdenum enrichments from black shales suggesting that molybdenum 
began accumulating in the oceans only after the Archaean 
eon. Our prediction of a significant increase in nickel utilization 
accords with geochemical models that predict a tenfold increase in 
the concentration of dissolved nickel between the Proterozoic eon 
and the present day but conflicts with a recent analysis of banded iron 
formations that inferred monotonically decreasing maximum concentrations 
of dissolved nickel from the Archaean onwards. The abundance 
of enzymes using oxidized forms of nitrogen (N₂O and NO₃⁻) also 
grows significantly over time, with one-third of nitrate-binding gene 
before the expansion, plotted on a log₂ scale. Metabolites whose enrichments are 
statistically significant at a false discovery rate of less than 10% or less than 5% 
(Fisher’s Exact Test) are identified with one or two asterisks, respectively. Bars are 
coloured by functional annotation or compound type (functional annotations 
were assigned manually). Metabolites were obtained from the KEGG database 
release 51.0 (ref. 27) and associated with clusters of orthologous groups of 
proteins (COGs) using the MicrobesOnline September 2008 database. 
Metabolites associated with fewer than 20 COGs or sharing more than two- 
thirds of gene families with other included metabolites are omitted. 
Abbreviations are defined in Supplementary Table 3. 


families appearing at the beginning of the expansion and three-quarters 
of nitrous-oxide-binding gene families appearing by the end of the 
expansion. The timing of these gene-family births provides phylogenomic 
evidence for an aerobic nitrogen cycle by the Late Archaean. 
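
The two-sided Kolmogorov-Smirnov comparisons reported above rest on the maximum distance between two empirical cumulative distributions. The sketch below computes only that statistic, not the P value, for two sets of event ages; the ages are hypothetical and the function name is an assumption rather than part of the published pipeline.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b) -> float:
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        # Empirical CDFs: fraction of each sample at or below x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical ages (Gyr ago) of copy-number changes, for illustration only.
compound_ages = [0.4, 0.8, 1.1, 1.5, 2.0]
background_ages = [1.9, 2.4, 2.8, 3.0, 3.2]
print(ks_statistic(compound_ages, background_ages))
```

A compound whose associated genes appear systematically later than the background gives a large D; identical distributions give D = 0.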

However, one striking discrepancy between our phylogenomic 
patterns and geochemical predictions is a modest but significant 
increase in iron-using genes over time (Fig. 2; false discovery rate < 
5%, two-sided Kolmogorov-Smirnov test). Declining iron solubility 
in oxygenated ocean surface waters and sulphide-mediated removal of 
iron from anoxic deeper waters are thought to have decreased overall 
iron bioavailability during the Proterozoic. If the abundance of iron-using 
genes tracks iron bioavailability, we would expect these genes to 
decrease in abundance after the Archaean. The conflicting phyloge- 
nomic result may reflect the confounding effect of evolutionary inertia, 
whereby microbes could have found more success in evolving a handful 
of metal-acquisition proteins (for example siderophores) rather than 
replacing a host of iron-binding proteins in the face of declining iron 
availability. Alternatively, the insolubility of iron in modern oceans 
may be offset by large organic pools of iron. 

Our chronologies of oxygen and redox-sensitive metal and compound 
utilization suggest ancient increases in oxygen bioavailability, as well as 
an Archaean biosphere with some of the basic genetic components 
Figure 2 | Genome utilization of redox-sensitive compounds over time. The 
top panel illustrates a gradual increase in the fraction of enzymes that bind 
molecular oxygen predicted to be present over Earth history (P = 3.4 × 10⁻⁸, 
two-sided Kolmogorov-Smirnov test). Colours indicate abundance 
normalized to present-day values. The lower four panels group transition 
metals, nitrogen compounds, sulphur compounds and C₁ compounds. The 
fraction of each group’s associated genes that bind a given compound, 
normalized to present-day fractions, is shown over time with a colour gradient. 
Enclosed boxes show raw fractional values at three time points: 3.5 Gyr ago 
(left), 2.5 Gyr ago (middle) and the present day (right). For example, 19.2% of 


required for oxygenic photosynthesis and respiration. These results are 
consistent with recent biomarker-based evidence for oxygenesis preceding 
the Palaeoproterozoic era by hundreds of millions of years. Still, a 
precise timeline for the origins of oxygenesis is currently beyond the 
resolution of our phylogenomic model. In the results described above, 
we estimated lineage divergence times with PhyloBayes, which enabled 
us to explicitly account for uncertainty in the timing of inferred events 
(Supplementary Fig. 13). An alternative model of evolutionary rates 
dated the rapid bacterial cladogenesis to 2.75-2.5 Gyr ago (in contrast to 
3.33-2.85 Gyr for PhyloBayes) but still finds evidence for an Archaean 
Expansion (Supplementary Fig. 9) characterized by the emergence of 
electron transport genes. Uncertainty or errors in the reference tree may 
further decrease the power of our phylogenomic model, obscuring 
evidence for all except the most extreme geochemical events. Future 
studies that benchmark biomarker and other geochemical data against 
transition-metal-binding genes are predicted to have bound Mn 2.5 Gyr ago, a 
value 1.28-fold the size of the present-day percentage of 15.0%. Values within 
parentheses give the overall number of gene families in each group. To 
determine which compounds showed divergent genome utilization over time, 
the timing of copy number changes for each compound’s associated genes was 
compared with a background model derived from all other compounds. 
Compounds whose utilization significantly differs from the background model 
are marked with an asterisk (false discovery rate < 5%, two-sided 
Kolmogorov-Smirnov test). Nitrite and nitric oxide are not shown, because of 
their COG-binding similarity to nitrate and nitrous oxide, respectively. 


the predicted age of associated gene families could be used to test and 
refine the ‘tree of life’, ultimately yielding an abundant and reliable source 
of Precambrian fossils: modern-day genomes. 


METHODS SUMMARY 


We developed AnGST to account for gene transfer, duplication, loss and de novo 
birth by comparing individual gene phylogenies with a previously described reference 
phylogeny. We refer to this process as tree reconciliation and provide a 
detailed description of the AnGST algorithm in Supplementary Methods. Unlike 
some previous methods, AnGST uses the topology of the gene family tree 
rather than just its presence or absence across genomes and can infer the direction 
of gene transfer in addition to gene duplication, birth and loss events. AnGST also 
accounts for uncertainty in gene trees by incorporating reconciliation into the tree- 
building process: the tree that minimizes the macroevolutionary cost function but 
is still supported by the sequence data is chosen as the best gene tree. To assess the 
sensitivity of our method to the reference tree topology, we reconciled gene families 
against 30 alternative reference trees rooted on either the bacterial, archaeal or 
eukaryotic branches. Inferred gene-family birth ages were consistent across the 
ensemble of reference trees, and the Archaean Expansion was a uniformly 
observed feature (Supplementary Figs 8 and 9). A conservative set of eight 
temporal constraints was selected from the geochemical and palaeontological 
literature (Supplementary Fig. 7 and Supplementary Table 1), and the 
PhyloBayes software package was used to infer a range of divergence times for 
each ancestral lineage on the reference tree. We did not apply temporal constraints 
to lineage ages on the gene trees. 
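
The selection rule described above, choosing the gene tree with the lowest macroevolutionary cost among those still supported by the sequence data, can be sketched as follows. This is not the AnGST algorithm, which is described in Supplementary Methods: the class, the event costs and the candidate trees are all hypothetical, and the event counts are taken as given rather than inferred by reconciliation.

```python
from dataclasses import dataclass


@dataclass
class CandidateTree:
    name: str
    supported: bool    # whether the sequence data still support this topology
    transfers: int     # inferred horizontal gene transfer events
    duplications: int  # inferred duplication events
    losses: int        # inferred loss events


# Hypothetical per-event costs, chosen only for illustration.
COSTS = {"transfers": 3.0, "duplications": 2.0, "losses": 1.0}


def reconciliation_cost(tree: CandidateTree) -> float:
    """Weighted sum of the events needed to reconcile a gene tree with the reference."""
    return (
        COSTS["transfers"] * tree.transfers
        + COSTS["duplications"] * tree.duplications
        + COSTS["losses"] * tree.losses
    )


def best_gene_tree(candidates):
    """Return the lowest-cost tree among those supported by the sequence data."""
    supported = [t for t in candidates if t.supported]
    if not supported:
        raise ValueError("no candidate tree is supported by the sequence data")
    return min(supported, key=reconciliation_cost)


candidates = [
    CandidateTree("A", True, transfers=1, duplications=0, losses=2),
    CandidateTree("B", True, transfers=0, duplications=1, losses=1),
    CandidateTree("C", False, transfers=0, duplications=0, losses=0),
]
print(best_gene_tree(candidates).name)
# prints: B
```

Tree C would be cheapest but is excluded because the sequence data do not support it, which mirrors the requirement that the chosen tree remain consistent with the alignment.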

The activated B-cell-like (ABC) subtype of diffuse large B-cell 
lymphoma (DLBCL) remains the least curable form of this malignancy 
despite recent advances in therapy. Constitutive nuclear 
factor (NF)-κB and JAK kinase signalling promotes malignant cell 
survival in these lymphomas, but the genetic basis for this signalling 
is incompletely understood. Here we describe the dependence of 
ABC DLBCLs on MYD88, an adaptor protein that mediates toll 
and interleukin (IL)-1 receptor signalling, and the discovery of 
highly recurrent oncogenic mutations affecting MYD88 in ABC 
DLBCL tumours. RNA interference screening revealed that 
MYD88 and the associated kinases IRAK1 and IRAK4 are essential 
for ABC DLBCL survival. High-throughput RNA resequencing 
uncovered MYD88 mutations in ABC DLBCL lines. Notably, 29% 
of ABC DLBCL tumours harboured the same amino acid sub- 
stitution, L265P, in the MYD88 Toll/IL-1 receptor (TIR) domain 
at an evolutionarily invariant residue in its hydrophobic core. This 
mutation was rare or absent in other DLBCL subtypes and Burkitt's 
lymphoma, but was observed in 9% of mucosa-associated lymphoid 
tissue lymphomas. At a lower frequency, additional mutations were 
observed in the MYD88 TIR domain, occurring in both the ABC 
and germinal centre B-cell-like (GCB) DLBCL subtypes. Survival of 
ABC DLBCL cells bearing the L265P mutation was sustained by the 
mutant but not the wild-type MYD88 isoform, demonstrating that 
L265P is a gain-of-function driver mutation. The L265P mutant 
promoted cell survival by spontaneously assembling a protein 
complex containing IRAK1 and IRAK4, leading to IRAK4 kinase 
activity, IRAK1 phosphorylation, NF-κB signalling, JAK kinase 
activation of STAT3, and secretion of IL-6, IL-10 and interferon-β. 
Hence, the MYD88 signalling pathway is integral to the pathogenesis 
of ABC DLBCL, supporting the development of inhibitors of IRAK4 
kinase and other components of this pathway for the treatment of 
tumours bearing oncogenic MYD88 mutations. 

The current molecular taxonomy of DLBCL distinguishes three 
main subtypes: ABC, GCB and primary mediastinal B-cell lymphoma 
(PMBL). Current therapy is least successful in ABC DLBCL, achieving 
less than a 40% cure rate. The anti-apoptotic NF-κB signalling pathway 
is constitutively active in ABC DLBCL owing to oncogenic 
CARD11 mutations or chronic active B-cell receptor signalling, augmented 
by inactivation of A20. A subset of ABC DLBCLs use JAK 
kinase signalling to activate the transcription factor STAT3, a pathway 
that synergizes with NF-κB in promoting cell survival. The oncogenic 
aetiology of this JAK-STAT3 signalling has not been elucidated. 

We conducted an RNA interference (RNAi) screen for genes that 
are required for proliferation and survival of lymphoma cell lines and 
identified three small hairpin RNAs (shRNAs) targeting MYD88 that 
were toxic to two ABC DLBCL lines but not to two GCB DLBCL lines 
(Supplementary Fig. 1a). During normal immune responses, MYD88 
functions as a signalling adaptor protein that activates the NF-κB 
pathway after stimulation of toll-like receptors (TLRs) and receptors 
for IL-1 and IL-18 (refs 2, 3). MYD88 coordinates the assembly of a 
multi-subunit signalling complex consisting of various members of the 
IRAK family of serine-threonine kinases. The initial RNAi screen 
also identified two shRNAs targeting IRAK1 as toxic for one or both 
of the ABC DLBCL lines, but not for GCB DLBCL lines. A subsequent 
screen identified additional MYD88 and IRAK1 shRNAs that were 
toxic to all five ABC DLBCL lines tested but had little effect on GCB 
DLBCL, Burkitt’s lymphoma, mantle cell lymphoma and multiple 
myeloma lines (Supplementary Fig. 1a). Using shRNAs targeting the 
3’ untranslated regions of MYD88 and IRAK1, which reduced expres- 
sion of their respective proteins (Supplementary Fig. 1c), we showed 
that ABC DLBCL cells could be rescued from shRNA-mediated toxicity 
by coexpression of coding region cDNAs (IRAK1, Supplementary Fig. 
1d; MYD88, see below). MYD88 and IRAK1 shRNAs displayed a time- 
dependent toxicity for ABC DLBCL lines and induced apoptosis, but 
had little effect on GCB DLBCL and myeloma lines (Fig. 1 and Sup- 
plementary Fig. 1b, e). Together these data establish that MYD88 and 
IRAK1 are required to maintain the viability of ABC DLBCL cells. 

To comprehensively discover somatic mutations in ABC DLBCL, we 
used high-throughput resequencing of mRNA to search for sequence 
variants in four ABC DLBCL lines. In addition to known mutations in 
CARD11 and CD79B, we identified a single nucleotide variant that 
changed a leucine residue at position 265 of the MYD88 coding region 
to proline (L265P) in all four ABC DLBCL lines tested. This variant 
resides in the MYD88 TIR domain, which interacts with TIR domains 
of various receptors during innate immune responses and also mediates 
homotypic interactions. 

Figure 1 | MYD88 is required for survival of ABC DLBCL cells. MYD88 and 
IRAK1 shRNAs have selective toxicity for ABC DLBCL lines. Shown is the 
fraction of GFP+, shRNA-expressing cells relative to the GFP−, shRNA-negative 
fraction at the indicated times, normalized to the day 0 values. 

To extend this finding, we resequenced the MYD88 coding region in 
382 lymphoma biopsy samples. The L265P mutation was by far the most 
common variant observed, occurring in 29% of ABC DLBCL biopsies. 
By contrast, this mutation was rare or absent among DLBCLs of the GCB 
and PMBL subtypes and among Burkitt’s lymphomas (Fig. 2b). Of note, 
MYD88 L265P was also observed in 9% of gastric mucosa-associated 
lymphoid tissue (MALT) lymphomas. Most MYD88 L265P mutations 
appeared heterozygous by sequencing, but six biopsy samples and one 
ABC DLBCL line (OCI-Ly3) were homozygous. By array-based comparative 
genomic hybridization, 56% (15 of 27) of the ABC DLBCL 
cases with gain or amplification of the MYD88 locus had the L265P 
mutation, compared to 29% (13 of 45) with wild-type MYD88 copy 
number (P = 0.023), indicating selection by the cancer cells for this 
mutant allele. A host of other, less common MYD88 mutations were 
equally distributed among ABC and GCB DLBCL cases (Fig. 2a, b). 
Whereas most mutations were in the TIR domain, one mutation 
(V52M) was in the death domain and two were between the death 
and TIR domains (S149G/I). Six ABC DLBCL lines had a MYD88 
mutation (Fig. 1), whereas all 14 GCB DLBCL lines tested were wild 
type. In 13 DLBCL cases for which matched germline DNA was 
available, the MYD88 mutations (L265P, V217F, S219C, M232T, 
S243N, T294P) were confirmed to be somatically acquired. Overall, 
MYD88 mutations were observed in 39% of ABC DLBCLs (Fig. 2b), 
establishing MYD88 as among the most frequently altered genes in 
this malignancy. 
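
The copy-number comparison above (15 of 27 cases with gain or amplification of the MYD88 locus versus 13 of 45 cases with wild-type copy number) is a 2×2 contingency question. The text does not state which test produced P = 0.023, so the following minimal sketch assumes a one-sided Fisher's exact test, written with the Python standard library only; the function name is illustrative and not taken from the paper.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the count
    in the top-left cell with all row and column totals held fixed.
    """
    row1 = a + b           # e.g. cases with gain/amplification
    col1 = a + c           # e.g. cases carrying L265P
    n = a + b + c + d      # all cases
    total = comb(n, row1)
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / total


# Counts quoted in the text: 15 of 27 amplified cases and 13 of 45
# copy-number wild-type cases carried L265P.
p_value = fisher_one_sided(15, 27 - 15, 13, 45 - 13)
print(f"one-sided Fisher P = {p_value:.4f}")
```

Whether a one-sided or two-sided test is appropriate depends on the hypothesis fixed before looking at the data; the two-sided value would be larger.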

The MYD88 mutations partially overlapped with abnormalities in 
CD79B/A, A20 and CARD11 in ABC DLBCL tumours (Fig. 2c). Among 
cases with a MYD88 L265P mutation, 34% had a coincident CD79B/A 
mutation whereas this overlap was significantly less common among 
ABC DLBCLs without a CD79B/A mutation (18%; P = 0.03). These 
data raise the possibility of a functional interaction between the chronic 
active B-cell receptor signalling that is associated with CD79B/A mutations 
and the signalling that is instigated by the MYD88 L265P mutation. 
Some cases had MYD88 L265P as well as a CARD11 mutation, 
which strongly activates NF-κB, suggesting that the MYD88 mutation 
confers additional biological attributes beyond NF-κB activation. 

The location of the MYD88 mutations within the three-dimensional 
structure of the MYD88 TIR domain was both surprising and instruc- 
tive (Fig. 2d). The L265P mutation occurs at a residue that is invariant 
in evolution and contributes to a β-sheet at the hydrophobic core of the 
domain. Another mutation, M232T, affects a methionine that is in an 
adjacent β-sheet and contacts the leucine affected by L265P. A cluster 
of mutations were in the ‘B-B loop’, an evolutionarily conserved region 
that mediates TIR domain interactions. Two other mutations, S222R 
and S243N, alter an adjoining face of the TIR domain. Only one 
mutant affects the opposite side of the TIR domain (T294P), altering 
the conserved ‘box 3’ motif that is important in IL-1 signalling. 

Figure 2 | MYD88 mutations in human lymphomas. a, MYD88 missense 
mutations in lymphoma biopsies and cell line models of ABC DLBCL (light 
blue), GCB DLBCL (orange), MALT lymphoma (purple) and Burkitt’s 
lymphoma (BL; dark blue). Amino acid positions are shown according to 
protein accession NP_002459. b, Frequencies of MYD88 mutations in biopsy 
samples from different lymphoma subtypes. c, Overlap of MYD88 mutations 
with other recurrent genetic alterations in ABC DLBCL tumour specimens. 
Genetic subsets were defined by somatic mutations and, in the case of the A20 
subset, by homozygous deletion or epigenetic silencing. d, Location of MYD88 
mutations in the three-dimensional structure of the MYD88 TIR domain. 
e, Dependence of ABC DLBCLs on MYD88 L265P. A 3'-UTR-directed MYD88 
shRNA was inducibly expressed in the indicated ABC DLBCL lines, which were 
stably transduced with rescue vectors expressing wild-type or L265P MYD88 
coding regions, or with an empty vector. Shown is the fraction of viable 
shRNA-expressing cells relative to the shRNA-negative fraction, normalized to day 0 
values. 

To examine whether the MYD88 mutants confer a gain or loss of 
function, we performed a complementation experiment in which we 
knocked down endogenous MYD88 in ABC DLBCL lines and ectopically 
expressed wild-type or mutant MYD88 coding regions. In ABC 
DLBCL lines harbouring an L265P mutation, MYD88 L265P rescued 
the cells after MYD88 knockdown, but wild-type MYD88 was ineffec- 
tive (Fig. 2e), although these MYD88 isoforms were expressed equiva- 
lently (data not shown). Hence, these ABC DLBCLs are ‘addicted’ to 
the action of the L265P MYD88 mutant, indicating that it is a gain-of- 
function driver mutation that confers a selective advantage during the 
evolution of ABC DLBCL tumours. 

To assess the biochemical and functional consequences of the MYD88 
mutations, we fused green fluorescent protein (GFP) to MYD88 and 
introduced the fusion proteins into DLBCL lines. Immunoprecipitation 
of MYD88-GFP with anti-GFP antibodies brought down IRAK1 and 
IRAK4, two kinases known to associate with MYD88 upon TLR or 
IL-1 stimulation (Fig. 3a). During IL-1 signalling, IRAK1 becomes 
hyperphosphorylated by IRAK4, resulting in slowly migrating IRAK1 
isoforms. In cells bearing MYD88 L265P, a prominent, slow-migrating 
IRAK1 species co-immunoprecipitated with MYD88 
(Fig. 3a). By contrast, wild-type MYD88 did not associate strongly with 
these IRAK1 isoforms, nor did the other MYD88 mutants tested 
(Fig. 3b). Treatment with λ-phosphatase collapsed the slow-migrating 
IRAK1 species into a single band, confirming that they are phosphorylated 
IRAK1 isoforms (Fig. 3c). Phosphorylation of endogenous IRAK1 
was observed in an ABC DLBCL line with L265P but not in a GCB 
DLBCL line (Fig. 3c). Thus, the MYD88 L265P mutant nucleates a 
signalling complex in ABC DLBCLs that includes phosphorylated 
IRAK1, consistent with a gain-of-function phenotype. 

IRAK4 co-immunoprecipitated with MYD88, but it associated 
equivalently with wild-type and L265P MYD88 (Fig. 3a). Knockdown 
of IRAK4 was toxic for ABC DLBCL lines but not for GCB DLBCL and 
myeloma lines (Fig. 3d and Supplementary Fig. 1c). Wild-type IRAK4 
rescued ABC DLBCL lines after IRAK4 shRNA induction, but a kinase- 
dead IRAK4 isoform could not (Fig. 3e), despite equivalent expression 
(data not shown). By contrast, IRAK1 kinase activity was not required 
for the survival of ABC DLBCL cells (Supplementary Fig. 1d). A selective 
small-molecule inhibitor of IRAK1 and IRAK4 kinase activity 
killed ABC DLBCL lines but not GCB DLBCL and myeloma lines 
(Fig. 3f). Together, these findings demonstrate that ABC DLBCLs rely 
upon IRAK4 kinase activity to transduce signals from MYD88 L265P 
that promote survival. 

To investigate signalling pathways that are engaged by MYD88 
L265P, we knocked it down in an ABC DLBCL line and profiled the 
ensuing gene expression changes (Supplementary Table 1 and Sup- 
plementary Fig. 2). We identified 285 genes that were down-modulated 
after MYD88 knockdown, and searched for overlap between this 
MYD88 signature and previously defined gene expression signatures 
(Supplementary Table 2). The most significantly enriched signature 
reflects NF-κB signalling in ABC DLBCL (44× enrichment, 
P = 2.4 × 10^[exponent illegible]). This signature was also inhibited after IRAK1 
knockdown (Supplementary Fig. 3), indicating that IRAK1 mediates 
NF-κB activation by MYD88 L265P. To compare the ability of wild-type 
and mutant MYD88 isoforms to activate NF-κB, we expressed 
them as GFP fusion proteins in a GCB DLBCL line with little endogenous 
NF-κB activity. Whereas wild-type MYD88 activated an NF-κB-dependent 
reporter modestly, L265P had strong activity, as did 
M232T and S243N, and S222R and T294P had an intermediate 
effect (Fig. 4b). At all MYD88 expression levels, L265P was superior to 
wild-type MYD88 in upregulating CD83, a previously established NF-κB 
target in this system (Fig. 4c). Other MYD88 mutants induced 
CD83 to varying degrees but all were more active than wild-type 
MYD88. Thus, mutant MYD88 isoforms can contribute to the constitutive 
NF-κB activation that typifies ABC DLBCL. 
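
The signature overlap reported above is expressed as a fold enrichment with a P value. The text does not describe the statistic used, so the sketch below is only one common way to frame such an overlap: fold enrichment as observed over expected overlap, and a hypergeometric tail probability. The gene-universe size, signature size and overlap count in the example are hypothetical placeholders; only the 285-gene size of the MYD88 signature comes from the text.

```python
from math import comb


def fold_enrichment(overlap: int, query_size: int,
                    signature_size: int, universe_size: int) -> float:
    """Observed overlap divided by the overlap expected by chance."""
    expected = query_size * signature_size / universe_size
    return overlap / expected


def hypergeom_tail(overlap: int, query_size: int,
                   signature_size: int, universe_size: int) -> float:
    """P(X >= overlap) when query_size genes are drawn at random
    from a universe containing signature_size signature genes."""
    total = comb(universe_size, query_size)
    upper = min(query_size, signature_size)
    hits = sum(
        comb(signature_size, k)
        * comb(universe_size - signature_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return hits / total


# Hypothetical numbers for illustration only (not from the paper),
# apart from the 285-gene query set.
QUERY = 285        # genes down-modulated after MYD88 knockdown
SIGNATURE = 100    # assumed size of a published signature
UNIVERSE = 20000   # assumed number of genes measured
OVERLAP = 20       # assumed number of shared genes

fold = fold_enrichment(OVERLAP, QUERY, SIGNATURE, UNIVERSE)
p_enrich = hypergeom_tail(OVERLAP, QUERY, SIGNATURE, UNIVERSE)
print(f"{fold:.1f}x enrichment, P = {p_enrich:.2e}")
```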

Figure 3 | MYD88 mutations are gain-of-function. a, An altered IRAK1 
isoform associated with MYD88 L265P. The GCB DLBCL line BJAB was 
transduced with GFP-tagged wild-type (WT) or L265P MYD88, or with an 
empty vector (null). Anti-GFP immunoprecipitates (IP) and input lysates were 
examined by immunoblotting for IRAK1, IRAK4 and MYD88. b, Preferential 
association of an altered IRAK1 isoform with MYD88 L265P. BJAB cells were 
transduced with the indicated GFP-tagged MYD88 isoforms and examined as 
in a. c, MYD88 L265P associates with phosphorylated IRAK1. Top panel: BJAB 
cells were transduced with the indicated GFP-tagged MYD88 isoforms. 
Anti-GFP immunoprecipitates were treated with λ-phosphatase as indicated and 
examined by immunoblotting for IRAK1 or MYD88. Bottom panel: 
anti-IRAK1 immunoprecipitates from HBL1 (ABC) or BJAB (GCB) cells were 
treated with λ-phosphatase as indicated and examined by immunoblotting for 
IRAK1. d, Toxicity of IRAK4 shRNAs for ABC DLBCLs. The indicated lines 
were transduced with retroviruses expressing IRAK4 shRNA and the relative 
number of shRNA+ cells is plotted versus time after shRNA induction, 
normalized to day 0. Data are representative of experiments with three different 
IRAK4 shRNAs. e, IRAK4 kinase activity is required for ABC DLBCL survival. 
The indicated ABC DLBCL lines were transduced with retroviruses expressing 
wild-type or kinase-dead IRAK4 isoforms, or with an empty vector (-ctrl). The 
survival of cells after induction of an IRAK4 shRNA is shown. f, A small-molecule 
IRAK1/4 kinase inhibitor is selectively lethal for ABC DLBCLs. 
Viability of the indicated lines was measured after treatment for 3 days with 
various inhibitor concentrations and normalized to DMSO-treated cells. 
g, IRAK4 kinase activity regulates IL-6 and IL-10 secretion. The indicated 
cytokines were measured in the supernatant of ABC DLBCL lines after 
treatment for 24 h with various concentrations of the IRAK1/4 inhibitor. 

A signature of JAK kinase signalling in ABC DLBCL overlapped 
significantly with the MYD88 signature (Fig. 4a; 14× enrichment, 
P = 9.6 × 10^[exponent illegible]) and with IRAK1-regulated genes (Supplementary 
Fig. 3b). This was notable because autocrine secretion of IL-6 and 
IL-10 drives JAK-STAT3 signalling in a subset of ABC DLBCLs. 
MYD88 knockdown significantly diminished the secretion of IL-6 and 
IL-10 as well as the phosphorylation of STAT3 in several ABC DLBCL 
lines (Fig. 4d, e and Supplementary Fig. 1f). IL-6 and IL-10 secretion was 
also blocked by the IRAK1/4 kinase inhibitor, indicating that IRAK4 
links MYD88 L265P signalling to the expression of these cytokines 
(Fig. 3g). Previous work identified a ‘STAT3-high’ subgroup of ABC 
DLBCL tumours with autocrine IL-6/IL-10 signalling and STAT3 phosphorylation, 
which was missing in a ‘STAT3-low’ subgroup. The 
MYD88 L265P mutation was significantly more common in the 
STAT3-high subgroup (37%) than in the STAT3-low subgroup (13%) 
(P = 0.0036), and other MYD88 mutations were modestly enriched 
among STAT3-high cases as well (Fig. 4f). These data indicate that 
MYD88 mutations contribute to JAK-STAT3 signalling in ABC DLBCL 
tumours. 

The MYD88 signature included a number of genes that are induced 
by type I interferon (Fig. 4a; 7× enrichment, P = 4.3 × 10^[exponent illegible]), which is 
intriguing given that MYD88 signalling can induce interferon (IFN)-β 
production by innate immune cells. IFN-β was measurable in the supernatant 
of the OCI-Ly3 ABC DLBCL line, and MYD88 knockdown 
diminished its secretion (Fig. 4d). MYD88 knockdown decreased 
IFN-β mRNA levels in the HBL1 line (Supplementary Fig. 2), although 
IFN-β secretion was below the detection limit. Future work should 
address whether the secretion of immunomodulatory cytokines such 
as IFN-β, IL-6 and IL-10 influences immune cells in the microenvironment 
of ABC DLBCL tumours. 

Figure 4 | MYD88 mutants activate NF-κB and cytokine signalling. a, Venn 
diagram of genes affected by MYD88 knockdown in the HBL1 ABC DLBCL 
line, grouped according to membership in gene expression signatures. 
b, MYD88 mutants constitutively activate NF-κB. BJAB cells co-expressing the 
indicated MYD88-GFP mutants and a NF-κB-driven luciferase reporter 
construct were assayed for luciferase activity, which was normalized to the 
expression levels of each MYD88-GFP isoform. a.u., arbitrary units. 
c, Correlation of MYD88 protein levels with CD83 expression. BJAB cells 
bearing GFP-tagged MYD88 isoforms were assessed for CD83 and GFP 
expression. Cells were assigned to equally sized bins based on their GFP levels, 
and the mean fluorescence intensity (m.f.i.) of CD83 in each bin was plotted. 
d, MYD88 knockdown decreases cytokine secretion in ABC DLBCL. MYD88 
or control (ctrl) shRNAs were induced in ABC DLBCL lines, and the indicated 
cytokines were measured in the supernatant over time. e, STAT3 
phosphorylation in ABC DLBCL depends on MYD88. MYD88 or control (ctrl) 
shRNAs were induced in ABC DLBCL lines, and cells were assessed by 
immunoblotting for phosphorylated STAT3 (pY-STAT3), total STAT3, 
MYD88 and β-actin. f, Preferential association of MYD88 mutant isoforms 
with the STAT3-high subgroup of ABC DLBCL tumours. See text for details. 
g, MYD88 and B-cell-receptor signalling pathways cooperate to maintain ABC 
DLBCL survival. OCI-Ly10 ABC DLBCL cells were first transduced with 
retroviruses expressing MYD88 or control shRNAs and then infected with 
retroviruses expressing CD79A, CARD11, or control shRNAs along with GFP. 
The relative viability of GFP+ cells is plotted, normalized to day 0 values. All 
error bars are s.e.m. (n = 3). 

Given the pleiotropic action of MYD88 L265P, we investigated 
whether MYD88 signalling cooperates with B-cell receptor signalling 
to maintain ABC DLBCL survival. Knockdown of MYD88 enhanced 
the killing of an ABC DLBCL line with chronic active BCR signalling 
by CD79B or CARD11 shRNAs (Fig. 4g). This finding indicates that 
MYD88 and B-cell receptor signalling provide non-redundant survival 
signals to ABC DLBCL cells, in keeping with the fact that some ABC 
DLBCL tumours harbour MYD88 L265P as well as CD79B or CARD11 
mutations (Fig. 2c). 

Our genetic and functional data establish a new oncogenic pathway 
in lymphomagenesis. Somatically acquired MYD88 mutations in ABC 
DLBCL promote NF-κB and JAK-STAT3 signalling, which mediate 
cell survival in this lymphoma type. MYD88 L265P was the most 
biologically potent mutant and was unique in its ability to coordinate a 
stable signalling complex containing phosphorylated IRAK1, which 
probably accounts for its high recurrence among lymphomas. MYD88 
L265P also genetically links MALT lymphoma with ABC DLBCL, two 
lymphoma subtypes that share other oncogenic features. Other 
MYD88 mutations may also be drivers of lymphomagenesis given their 
recurrent nature and ability to activate NF-κB. From a therapeutic 
standpoint, the signalling complex coordinated by MYD88 L265P 
represents an enticing target. Our study also provides a genetic method 
to identify patients whose tumours may depend upon MYD88 signalling 
and who may therefore benefit from therapies targeting IRAK4 
alone or in combination with agents targeting the B-cell receptor, NF-κB 
or JAK-STAT3 pathways. 


METHODS SUMMARY 


RNAi screens, doxycycline-inducible shRNA expression and shRNA toxicity 
assays were described previously. RNA interference screening results are listed 
in Supplementary Table 3. The sequences of individual shRNAs described in the 
figures are given in Supplementary Table 4. Gene expression profiling data have 
been submitted to GEO under accession number GSE22900. 

Pre-treatment tumour biopsies were obtained from patients with de novo DLBCL, 
Burkitt’s lymphoma and gastric MALT lymphoma. Samples were analysed as per a 
protocol approved by the National Cancer Institute Institutional Review Board. 
Assignment of DLBCL specimens to subtypes was performed as described. 
High-throughput RNA sequencing was accomplished using an Illumina GAIIx instrument. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

Cell lines. Cell lines were cultured in RPMI 1640 medium supplemented with 
penicillin/streptomycin and 10% fetal bovine serum or, for the OCI series of cell 
lines, Iscove’s medium with 20% fresh human plasma. Cells were maintained in a 
humidified, 5% CO2 incubator at 37 °C. All cell lines were engineered to express an 
ecotropic retroviral receptor and the bacterial tetracycline repressor as previously 
described. 

Retroviral vectors and retroviral transduction. A previously described retroviral 
vector, pRSMX, was used to express shRNA for the library screen. A modified 
version of this vector, pRSMX-PG, in which the puromycin selectable marker was 
fused with green fluorescent protein (GFP), was used to co-express shRNA 
and GFP for the shRNA toxicity assays. Retroviral transduction was performed by 
transfecting the retroviral vector and a mixture of helper plasmids for a mutant 
ecotropic envelope and gag and pol into 293T cells using Lipofectamine 2000 
(Invitrogen). Retroviral supernatants were harvested 48 h after transfection and 
were used to transduce ecotropic receptor-expressing target cells by centrifugation 
at 2,500 r.p.m. for 1.5 h in the presence of 8 µg ml−1 polybrene. 

shRNA library screening. Pools of the shRNA library were screened as previously 
described. Briefly, pools of roughly 1,000 shRNA-expressing retroviral vectors 
were used to transduce target cell lines. After puromycin selection, stable integrants 
were induced to express shRNA by doxycycline (20 ng ml−1). Parallel uninduced 
cultures were kept as controls. After 3 weeks of shRNA induction, genomic DNA 
from both uninduced and induced cultures was harvested. shRNA-associated bar 
code sequences from the genomic DNA were PCR amplified and in vitro transcribed, 
as described. The transcribed RNA products were labelled fluorescently 
with either Cy5 (induced) or Cy3 (uninduced) using the Universal Linkage 
System (Amersham Biosciences) and hybridized onto microarrays containing 
DNA oligonucleotides complementary to the bar code sequences, as described. 
Each bar code experiment was performed in quadruplicate, and the microarray 
results for each bar code were averaged. The complete screening results are 
presented in Supplementary Table 3, which includes some data that have been 
previously published. 

shRNA sequences. The shRNA sequences used in individual experiments are 
listed in Supplementary Table 4. 

shRNA toxicity and complementation assays. shRNA toxicity was assayed as 
described. Briefly, the pRSMX-PG vector that co-expresses shRNA and GFP was 
transduced into lymphoma and multiple myeloma cell lines. Two days after retroviral 
transduction, doxycycline was added to induce shRNA expression. The fraction 
of GFP+, shRNA-expressing cells relative to the GFP−, shRNA-negative fraction was 
monitored at various time points by flow cytometry and normalized to the 
same ratio on day 0 of doxycycline induction. A 
reduction of the GFP+, shRNA-expressing fraction at later time points indicates 
shRNA toxicity. Complementation studies were performed in DLBCL cell lines 
that harbour the MYD88 L265P mutation. HBL1 cells were transduced with 
retroviral vectors that co-express GFP and shRNA targeting the 3’ UTR of 
MYD88 (or IRAK1). shRNA-transduced cells were subsequently infected with 
retroviruses co-expressing wild-type or mutant MYD88 (or IRAK1) coding 
regions and mouse CD8 (Lyt2). The cell fraction positive for GFP and CD8 (using 
anti-mouse CD8a, BD Pharmingen) was monitored over time by flow cytometry 
as above. TMD8 and OCI-Ly3 cells were first transduced with retroviruses co- 
expressing wild-type or mutant MYD88 (or IRAK1) coding regions and mouse 
CD8 (Lyt2) and enriched for Lyt2 expression with magnetic beads. Enriched cells 
were subsequently infected with retroviral vectors that co-express GFP and an 
shRNA targeting the 3’ UTR of MYD88. The cell fraction positive for GFP and 
CD8 was monitored over time by flow cytometry. 
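
The toxicity readout described above reduces to a ratio of ratios: the GFP-positive to GFP-negative ratio at each time point, divided by the same ratio on day 0. The sketch below is one reading of that description; the function names and the example fractions are hypothetical and not taken from the paper.

```python
def gfp_ratio(gfp_positive_fraction: float) -> float:
    """Ratio of GFP-positive (shRNA-expressing) to GFP-negative cells."""
    return gfp_positive_fraction / (1.0 - gfp_positive_fraction)


def normalized_toxicity_curve(gfp_positive_by_day: dict) -> dict:
    """Map {day: GFP+ fraction} to {day: ratio relative to day 0}.

    Values below 1.0 mean the shRNA-expressing cells are being lost
    relative to their untransduced neighbours, i.e. the shRNA is toxic.
    """
    baseline = gfp_ratio(gfp_positive_by_day[0])
    return {
        day: gfp_ratio(fraction) / baseline
        for day, fraction in sorted(gfp_positive_by_day.items())
    }


# Hypothetical flow cytometry readings (fraction of GFP+ cells per day).
readings = {0: 0.50, 4: 0.25, 8: 0.10}
curve = normalized_toxicity_curve(readings)
for day, value in curve.items():
    print(f"day {day}: {value:.2f}")
```

Normalizing to the GFP-negative cells in the same culture controls for culture-wide effects such as doxycycline exposure or medium exhaustion.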

For the IRAK4 complementation assay, HBL1, HLY1, TMD8 and SUDHL2 
lymphoma cell lines were first retrovirally transduced with either vLyt2 empty 
vector, or vLyt2 vector expressing wild-type or kinase-dead IRAK4 (K213A/ 
K214A). The infected cells were later enriched using the CD8 microbeads method 
(Miltenyi Biotec) according to manufacturer’s protocol. The enriched cells were 
then infected with retroviral vector pRSMX-PG expressing either a control shRNA 
or an IRAK4 shRNA. Doxycycline was added 2 days after infection to induce 
shRNA expression. The fraction of GFP-positive cells was monitored over time 
by FACS analysis; a decline in GFP-positive, and hence shRNA-expressing, cells indicates 
toxicity. 

Synergistic toxicity of MYD88 and either CARD11 or CD79A knockdown. 
OCI-Ly10 cells were first infected with either pRSMX-puro empty vector or 
pRSMX-puro vector encoding an MYD88 shRNA sequence (5'-GTACCAGTATTTATACCTCTA-3'). 
Two days after infection the two cell lines were selected 
using 1 µg ml−1 puromycin. The selected cells were then retrovirally infected with 
pRSMX-PG encoding either a scramble control or an shRNA sequence against 
CARD11 (target sequence 5'-GGGGTGTGTACCAGGCTATGA-3') or CD79A 
(target sequence 5'-GGGGCTTCCTTAGTCATATTC-3'). Doxycycline was 
added 2 days after infection to induce shRNA expression. The fraction of GFP-positive 
cells was monitored over time by FACS analysis; a decline in GFP-positive, 
and hence shRNA-expressing, cells indicates toxicity. 

High-throughput RNA sequencing/PCR amplification/Sanger sequencing. 
The standard Illumina pipeline for RNA-seq was used, using paired-end 75- 
base-pair runs with each sample run in one sequencing lane, yielding ~20 million 
reads per sample. Sequences were mapped back to both RefSeq and Ensembl 
transcript models using the BWA algorithm, yielding a median resequencing 
coverage of 10. Single nucleotide variants (SNVs) were reported that deviated 
from the human reference genome sequence, were observed in both sequencing 
directions, represented >20% of the resequencing coverage at a particular base 
pair position, and were not known SNPs in the dbSNP database of NCBI. A total of 
52,160 putative SNVs was detected in the four ABC DLBCL cell lines studied. 
Sequences have been submitted to the NCBI short sequence archive under accession 
SRP003192. On the basis of the criteria above, all SNVs that are not represented in 
publicly available databases of single nucleotide polymorphisms (SNPs) are listed 
in Supplementary Table 5. Except for the MYD88 mutations, other SNVs in this 
table have not been validated by independent means. 
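
The four SNV reporting criteria above (differs from the reference, seen in both sequencing directions, more than 20% of the coverage at that position, not a known dbSNP entry) can be expressed as a simple filter. This is a minimal sketch under assumptions: the record fields, the function name and the representation of dbSNP as a Python set are all hypothetical, and the actual pipeline used in the study is not described beyond these criteria.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_forward: int  # reads supporting alt on the forward strand
    alt_reverse: int  # reads supporting alt on the reverse strand
    depth: int        # total coverage at this position


def passes_filters(v: Variant, known_snps: set,
                   min_fraction: float = 0.20) -> bool:
    """Apply the four reporting criteria described in the text."""
    differs_from_reference = v.alt != v.ref
    seen_in_both_directions = v.alt_forward > 0 and v.alt_reverse > 0
    alt_reads = v.alt_forward + v.alt_reverse
    fraction = alt_reads / v.depth if v.depth else 0.0
    not_known_snp = (v.chrom, v.pos, v.alt) not in known_snps
    return (differs_from_reference and seen_in_both_directions
            and fraction > min_fraction and not_known_snp)


# Hypothetical records; coordinates are placeholders.
known = {("chr1", 1000, "T")}
candidates = [
    Variant("chr3", 500, "T", "C", 4, 3, 12),   # passes every criterion
    Variant("chr3", 600, "A", "G", 5, 0, 12),   # one strand only
    Variant("chr3", 700, "G", "A", 1, 1, 40),   # only 5% of coverage
    Variant("chr1", 1000, "C", "T", 6, 6, 20),  # listed as a known SNP
]
reported = [v for v in candidates if passes_filters(v, known)]
print([(v.chrom, v.pos) for v in reported])
```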

Sanger sequencing of MYD88 was accomplished with the following primers: 
MYD88-Full-F, 5'-GACCTCTCCAGATCTCAAAAGGCAGATTCC-3' (PCR 
amplification and sequencing, exon 1); MYD88-Full-R, 5’-GCAGAAGTACAT 
GGACAGGCAGACAGATAC-3’ (PCR amplification and sequencing, exon 5); 
MYD88-Seq-E1R, 5'-TCTCTCCATGGGAGACAGGATGCTG-3’ (sequencing 
exon 1); MYD88-Seq-E2F, 5'-TGGGTAAAGAGGTAGGCACTCCCAG-3’ (sequen- 
cing exon 2); MYD88-Seq-E2R, 5’-GCCCATCTGCTTCAAACACCCATGC-3’ 
(sequencing exon 2); MYD88-Seq-E3F, 5’-AAGCCTTCCCATGGAGCTCTGAC 
CAC-3' (sequencing exon 3); MYD88-Seq-E3R, 5’-GCTAGGAGGAGATGCCC 
AGTATCTG-3’ (sequencing exon 3); MYD88-Seq-E4F, 5’-ACTAAGTTGC 
CACAGGACCTGCAGC-3’ (sequencing exon 4); MYD88-Seq-E4R, 5'-ATCCA 
GAGGCCCCACCTACACATTC-3' (sequencing exon 4); MYD88-Seq-E5F, 
5'-GTTGTTAACCCTGGGGTTGAAG-3’ (sequencing exon 5). 

For 155 cases of ABC DLBCL, analysis of CARD11, CD79B, and A20 by Sanger 
sequencing was performed as described. 

A20 was declared epigenetically silenced if the expression level in a case was 
more than 2 standard deviations below the mean of other ABC DLBCL cases, 
based on previous gene expression profiling data. Deletion of the A20 locus 
(TNFAIP3) was analysed by quantitative PCR using primers to amplify exon 3 
and exon 6, as described, and compared to a reference gene, CHMP4A, that is not 
subject to copy number alterations in ABC DLBCL. A20 was declared deleted if 
one or both of the A20 PCR products had an estimated copy number that was 
more than 3 standard deviations below the average of 9 normal control DNA 
samples. The following quantitative PCR primers were used: CHMP4A-F, 
5'-CTGCAAGGGAGGAGGGGTTTCATTC-3’ (qPCR control gene); CHMP4A-R, 
5'-CTTTGGGTGTTCCTTCTGGCCAGTC-3’ (qPCR control gene); A20-E3F, 
5'-ACCTTTGCTGGGTCTTACATGCAG-3' (qPCR A20); A20-E3R, 5’-TAT 
GCCCACCATGGAGCTCTGTTAG-3’ (qPCR A20); A20-E6F, 5’-TGAGATC 
TACTTACCTATGGCCTTG-3’ (qPCR A20); A20-E6R, 5'-TCAGGTGGCTGA 
GGTTAAAGACAG-3' (qPCR A20). 
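
The two A20 calling rules above are both threshold tests against a reference distribution: more than 2 standard deviations below the mean of other ABC DLBCL cases for silencing, and more than 3 standard deviations below the average of 9 normal controls for deletion of either exon product. The sketch below assumes the sample standard deviation is meant and uses hypothetical function names and example values; the text does not specify these details.

```python
from statistics import mean, stdev


def below_threshold(value: float, reference: list, n_sd: float) -> bool:
    """True if value lies more than n_sd sample SDs below the reference mean."""
    return value < mean(reference) - n_sd * stdev(reference)


def a20_silenced(expression: float, other_cases: list) -> bool:
    """Silencing call: expression > 2 SD below the mean of other cases."""
    return below_threshold(expression, other_cases, 2)


def a20_deleted(exon3_copy: float, exon6_copy: float,
                normal_exon3: list, normal_exon6: list) -> bool:
    """Deletion call: one or both exon products > 3 SD below normal controls."""
    return (below_threshold(exon3_copy, normal_exon3, 3)
            or below_threshold(exon6_copy, normal_exon6, 3))


# Hypothetical values: 9 normal control copy-number estimates (relative to
# the CHMP4A reference gene) and expression values from other cases.
normals = [2.0, 2.1, 1.9, 2.0, 2.05, 1.95, 2.0, 2.1, 1.9]
others = [10.0, 11.0, 9.0, 10.0, 10.0]

print(a20_deleted(1.0, 2.0, normals, normals))  # exon 3 product is low
print(a20_silenced(7.0, others))
```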

Expression vectors and cDNA mutagenesis. The expression vector, vLyt2-MYD88-EGFP, 
encoding carboxy-terminal EGFP-tagged MYD88, was constructed 
by three-way ligation of PCR-generated MYD88 and EGFP products into 
the pBMN-IRES-Lyt2 vector (provided by G. Nolan). The restriction site SalI was 
included in the MYD88 PCR reverse primer and the EGFP PCR forward primer to 
facilitate the ligation between MYD88 and EGFP. MYD88 or EGFP PCR products 
were generated using the following primer pairs: 5’-TAGTAGGGATCCG 
CCGCCACCATGCGACCCGACCGCGCTGA-3' (MYD88), 5’-TAGTAGGTC 
GACGGGCAGGGACAAGGCCTTGGC-3’ (MYD88), 5’-TAGTAGGTCGACA 
TGGTGAGCAAGGGCGAGGAG-3' (EGFP), 5’-TAGTAGGCGGCCGCTTA 
CTTGTACAGCTCGTCCAT-3’ (EGFP). 
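
The primer design described above can be checked directly against the listed sequences: the SalI recognition sequence (GTCGAC) should appear in the MYD88 reverse primer and the EGFP forward primer, and in neither of the outer primers. The sketch below is an illustrative check only; the dictionary keys and function name are hypothetical labels, and the primer strings are the four sequences given in the text with line breaks removed.

```python
SALI_SITE = "GTCGAC"  # SalI recognition sequence

# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "MYD88_forward": "TAGTAGGGATCCGCCGCCACCATGCGACCCGACCGCGCTGA",
    "MYD88_reverse": "TAGTAGGTCGACGGGCAGGGACAAGGCCTTGGC",
    "EGFP_forward": "TAGTAGGTCGACATGGTGAGCAAGGGCGAGGAG",
    "EGFP_reverse": "TAGTAGGCGGCCGCTTACTTGTACAGCTCGTCCAT",
}


def has_site(primer: str, site: str = SALI_SITE) -> bool:
    """True if the recognition sequence occurs anywhere in the primer."""
    return site in primer.upper()


sites = {name: has_site(seq) for name, seq in PRIMERS.items()}
for name, present in sites.items():
    print(f"{name}: SalI site {'present' if present else 'absent'}")
```

Placing the shared site at the junction lets the two digested PCR products be joined in frame during the three-way ligation.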

The expression vector vLyt2-AU1-MYD88 encoding amino terminus AU1- 
tagged MYD88 was constructed by inserting PCR-generated MYD88 into the 
pBMN-IRES-Lyt2 vector. MYD88 PCR product was generated using the following 
primers: 5'-TAGTAGGGATCCGCCGCCACCATGGACACATACCGCTACA 
TCCGACCCGACCGCGCTGAGGCT-3’ and 5’-TAGTAGGCGGCCGCTCAG 
GGCAGGGACAAGGCCTTGGC-3’. 

IRAK1 expression vectors were similarly created by inserting PCR-generated 
IRAK1 cDNAs (from pIND-IRAK1 wild type and K239A kinase-dead templates, a 
gift from X. Li) into the pBMN-IRES-Lyt2 vector, using the following PCR pri- 
mers: 5'-TAGTAGCTCGAGGCCGCCACCATGGCCGGGGGGCCGGGC-3’ 
and 5’-TAGTAGGCGGCCGCTCACTTGTCATCGTCGTCCTTGTAGTCGC 
TCTGAAATTCATCACTTTC-3’. 

IRAK4 expression vectors were generated by inserting PCR-generated IRAK4 
cDNA (from a template obtained from the Dana-Farber/Harvard Cancer Center 
DNA Resource Core) into the pBMN-IRES-Lyt2 vector, using the following primers: 
5'-TAGTAGGGATGGGCCGCCACCATGGACACATACCGCTACATC 
AACAAACCCATAACACCATCA-3’ and 5'-TAGTAGGCGGCCGCTCAAGA 
AGCTGTCATCTCTTGCAG-3’. 

MYD88 mutants were created with the Phusion site-directed mutagenesis kit 
(New England BioLabs), using either vLyt2-MYD88-EGFP or vLyt2-AU1- 
MYD88 vector as templates. All cDNA inserts from PCR cloning and site-directed 
mutagenesis were verified by sequencing. The MYD88 mutagenesis primers used 
were the following: L265P forward P-CATCAGAAGCGACCGATCCCCATC 
AAG and L265P reverse P-GGCACCTGGAGAGAGGCTGAGTGCAAA; 
M232T forward P-AGGTGCCGCCGGACGGTGGTGGTTGTC and M232T 
reverse P-CTTTTCGATGAGCTCACTAGCAATAGA; S243N forward P-GAT 
TACCTGCAGAACAAGGAATGTGAC and S243N reverse P-ATCAGAGACA 
ACCACCACCATCCGG; T294P forward P-ACCAACCCCTGCCCCAAATCT 
TGGTTC and T294P reverse P-GTAGTCGCAGACAGTGATGAACCTCAG; 
S222R forward P-GGTCTATTGCTAGGGAGCTCATCGAAA and S222R 
reverse P-AGACACAGGTGCCAGGCAGGACATCGC. 

The IRAK4 kinase-dead mutant was generated similarly using the following 
mutagenesis primers: K213A/K214A forward P-ACTGTGGCAGTGGCGGCG 
CTTGCAGCAATG and K213A/K214A reverse P-TGTGTTATTTACGTAGC 
CTTTATATACA. 

MYD88 co-immunoprecipitation. BJAB cells were retrovirally transduced with 
various MYD88-GFP fusion constructs co-expressing a Lyt2 surface marker. Cells 
were enriched for Lyt2 expression using anti-Lyt2 magnetic beads (Invitrogen, 
114.47D) following the manufacturer’s instructions. Enriched cells were lysed at 
10^7 cells per ml in RIPA buffer (0.5% Triton X-100, 0.5% deoxycholate, 0.05% SDS, 
10 mM Tris, pH 8.0, 50 mM NaCl, 10 mM EDTA, 1 mM Na3VO4, 30 mM pyrophosphate, 
10 mM glycerophosphate, 1 mM AEBSF, 0.02 U ml−1 aprotinin and 
0.01% NaN3) for 10 min on ice. Lysates were cleared by centrifuging for 20 min at 
14,000g at 4 °C. MYD88-GFP constructs were immunoprecipitated with washed 
anti-GFP agarose beads (Chromotek) for 30 min at 4 °C, then washed 3-4 times in 
1× RIPA buffer. For λ-phosphatase treatment, the agarose beads were washed two 
additional times in 10 mM Tris, pH 8.0 with 50 mM NaCl to remove EDTA and 
phosphatase inhibitors. λ-phosphatase (New England Biolabs) treatment was done 
according to the manufacturer’s instructions. Reactions were quenched by the 
addition of 2× Laemmli sample buffer followed by boiling. Samples were separated 
on 10% polyacrylamide gels and transferred to Immobilon-P PVDF membranes 
(Millipore) for western blot analysis. Antibodies used for immunoblotting were 
anti-IRAK1 rabbit polyclonal (Santa Cruz Biotech), anti-IRAK4 rabbit polyclonal 
(Cell Signaling Technologies) and anti-MYD88 rabbit monoclonal (Cell Signaling 
Technologies). 

NF-κB reporter assay. BJAB cells retrovirally expressing MYD88-GFP constructs 
(see above) were transduced with lentiviral particles containing a NF-κB firefly 
luciferase reporter construct by following the manufacturer’s instructions (SA 
Biosciences). Firefly luciferase activity was measured using the Dual-Luciferase 
Reporter Assay System (Promega) following the manufacturer’s instructions. 
Luminescence from equivalent amounts of lysate was read in triplicate on a 
Microtiter Plate Luminometer (Dyn-Ex Technologies). All readings were normalized 
to the mean fluorescence intensity of MYD88-GFP expression for each MYD88 
mutant as determined by FACS analysis on a FACScalibur flow cytometer (Becton 
Dickinson). 

Western blotting. Cells were lysed in lysis buffer (50 mM Tris pH 7.4, 150 mM 
NaCl, 1% Triton X-100, 1% NP-40, 2 mM EDTA) supplemented with Complete
Protease Inhibitor Cocktail Tablets (Roche) and phosphatase inhibitors (Sigma)
for 30 min. Lysates were cleared by centrifugation at 15,000g at 4 °C for 10 min and
protein concentrations were determined by BCA protein assay (Pierce). Lysates
(80-100 µg) were subjected to electrophoresis through a 4-12% Bis-Tris gel
(Invitrogen) and immobilized on nitrocellulose membranes. Proteins were
detected using the following antibodies: MYD88, IRAK4, β-actin, STAT3 and
p-STAT3 (Y705) (Cell Signaling Technology).

IRAK1 immunoprecipitation. Cells were lysed at 10⁷ cells per ml in RIPA buffer
as described above. Lysates were pre-cleared with protein-A agarose beads (Pierce)
before incubation with 1 µg ml⁻¹ of anti-IRAK1 polyclonal antibody (Santa Cruz
Biotech, H-273) for 2 h on ice. Protein-A agarose beads were added to lysates and
rotated for 1 h at 4 °C, then washed three times with 1X RIPA buffer. λ-phosphatase
treatment was performed as described above. Samples were separated on 10%
polyacrylamide gels and transferred to Immobilon-P PVDF membranes (Millipore) for
western blot analysis. 

Cytokine measurement. The culture medium of cells transduced with inducible 
shRNAs was replaced with fresh medium plus doxycycline, and the concentrations 
of IL-6, IL-10 or IFN-β in culture supernatants at the indicated times were measured
by ELISA (R&D Systems). Alternatively, unmanipulated lymphoma cell lines were
placed into fresh media with the addition of the IRAK1/4 inhibitor (EMD Chemicals)
and assessed for cytokines as above. The results were normalized to live cell numbers, 
and are representative of at least two independent experiments. 

Apoptosis measurements. HBL1 cells were retrovirally transduced with either 
control or MYD88-specific shRNAs, as described above. shRNA expression was 
induced with doxycycline and cells were evaluated for apoptosis at 2, 3 and 4 days
after induction. To measure apoptosis, cells were first fixed for 10 min with 1.5% 
paraformaldehyde, centrifuged and then fixed and permeabilized in cold methanol 
overnight. Methanol-fixed cells were washed three times with FACS buffer (PBS 
with 1% FBS) and stained with PE rabbit anti-active caspase 3 (BD Pharmingen) 
and Alexa Fluor 647 mouse anti-cleaved PARP (Asp 214) (BD Pharmingen) for 
20 min at room temperature in the dark, followed by an additional wash with 
FACS buffer. Stained cells were subjected to FACS analysis (FACScalibur, BD) and 
apoptotic cells were defined as double positive for both active caspase 3 and 
cleaved PARP. 

Cell viability assay by MTS. The described DLBCL and multiple myeloma cell
lines were plated in duplicate at a density of 50,000 cells per well in 96-well plates 
along with DMSO as negative control, or different concentrations of IRAK1/4 
inhibitor (EMD Chemicals). Cell viability at 1, 2 and 3 days after drug treatment 
was assayed by adding 3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)- 
2-(4-sulphophenyl)-2H-tetrazolium and an electron coupling reagent (phenazine 
methosulphate; Promega), incubating for 3 h and measuring absorbance at 490 nm
using a 96-well plate reader. The presented data were derived
from 3 days of drug treatment. The assay was performed twice. 

MYD88 and IRAK1 signature analysis. To generate a gene expression signature 
of MYD88 signalling in ABC DLBCL, the HBL1 cell line was transduced with
retroviral vectors expressing either shMYD88-4 or shMYD88-7. After puromycin
selection, shRNA expression was induced for 24 or 48 h and gene expression was
measured relative to parallel uninduced cultures using Agilent 4×44K oligonucleotide
microarrays. A set of 284 MYD88 target genes was selected as those that
were downregulated by 0.4 log2 in ≥3 arrays. A signature of NF-κB signalling (NF-κB-10
signature; http://lymphochip.nih.gov/signaturedb/) in ABC DLBCL was
generated by treating HBL1 cells with the IκB kinase-β inhibitor MLN120B for
2 h, 3 h, 4 h, 6 h, 8 h, 12 h, 16 h and 24 h. Genes that were downregulated >0.4 log2
in at least four arrays with a one-sided t-test P < 0.01 were chosen. A signature of
JAK signalling in ABC DLBCL (JAKUp-2 signature;
http://lymphochip.nih.gov/signaturedb/) was generated by treating HBL1 cells with JAK inhibitor I (5 µM;
Calbiochem) for 2 h, 4 h, 6 h and 8 h. Genes were chosen that were decreased in
expression by >0.4 log2 at ≥3 time points. A signature of IFN signalling (IFN-3;
http://lymphochip.nih.gov/signaturedb/) was curated as the union between three 
published gene expression signatures of type I interferon signalling (IFN-1, IRF3-1 
and Module-3.1 signatures; http://lymphochip.nih.gov/signaturedb/). A Fisher’s 
exact test was used to calculate the significance of the overlap between the MYD88 
signature and the other signatures. 
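
The selection rule and overlap test described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the toy data, the reading of "downregulated by 0.4 log2" as a log2 ratio of at most -0.4, and the choice of a one-sided (enrichment) Fisher's exact test are all assumptions, as is the size of the gene universe, which the text does not state.

```python
from math import comb


def select_signature(log2_ratios, threshold=-0.4, min_arrays=3):
    """Return genes whose log2 ratio is <= threshold in at least min_arrays arrays.

    log2_ratios maps a gene name to its induced/uninduced log2 ratios,
    one value per array (hypothetical input layout).
    """
    return {
        gene
        for gene, values in log2_ratios.items()
        if sum(v <= threshold for v in values) >= min_arrays
    }


def fisher_overlap_p(sig_a, sig_b, universe_size):
    """One-sided Fisher's exact P value for the overlap of two gene sets.

    Computed as the upper tail of the hypergeometric distribution; both
    sets are assumed to be drawn from the same universe of genes.
    """
    overlap = len(sig_a & sig_b)
    a, b = len(sig_a), len(sig_b)
    total = comb(universe_size, b)
    tail = sum(
        comb(a, i) * comb(universe_size - a, b - i)
        for i in range(overlap, min(a, b) + 1)
    )
    return tail / total


# Toy data: three genes measured on four arrays (illustrative values only).
toy_ratios = {
    "GENE_A": [-0.5, -0.6, -0.45, -0.1],
    "GENE_B": [-0.5, -0.1, 0.0, 0.2],
    "GENE_C": [-0.9, -0.8, -0.7, -0.6],
}
toy_signature = select_signature(toy_ratios)
toy_p = fisher_overlap_p({1, 2, 3}, {2, 3, 4}, universe_size=10)
```

A two-sided test, or a different background gene set, would change the P values; the sketch only shows the shape of the calculation.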

Similar methods were used to generate a signature of IRAK1 signalling in ABC 
DLBCL. Two ABC DLBCL cell lines, HBL1 and TMD8, were transduced with 
retroviruses expressing shIRAK1-3 or a control shRNA. After puromycin selection,
shRNA expression was induced for 24 h or 48 h, RNA was isolated, and relative gene
expression in shIRAK1- and control shRNA-expressing cells was analysed by gene
expression profiling as above. A signature of 350 genes was selected as those that
were downregulated by 0.4 log2 in ≥3 arrays.
Spatially asymmetric reorganization of inhibition
establishes a motion-sensitive circuit 


Keisuke Yonehara, Kamill Balint, Masaharu Noda, Georg Nagel, Ernst Bamberg & Botond Roska


Spatial asymmetries in neural connectivity have an important role 
in creating basic building blocks of neuronal processing. A key
circuit module of directionally selective (DS) retinal ganglion cells
is a spatially asymmetric inhibitory input from starburst amacrine
cells. It is not known how and when this circuit asymmetry is
established during development. Here we photostimulate mouse
starburst cells targeted with channelrhodopsin-2 (refs 6-8) while
recording from a single genetically labelled type of DS cell. We
follow the spatial distribution of synaptic strengths between 
starburst and DS cells during early postnatal development before 
these neurons can respond to a physiological light stimulus, and 
confirm connectivity by monosynaptically restricted trans-synaptic 
rabies viral tracing. We show that asymmetry develops rapidly over 
a 2-day period through an intermediate state in which random 
or symmetric synaptic connections have been established. The 
development of asymmetry involves the spatially selective reorgani- 
zation of inhibitory synaptic inputs. Intriguingly, the spatial 
distribution of excitatory synaptic inputs from starburst cells is 
significantly more symmetric than that of the inhibitory inputs 
at the end of this developmental period. Our work demonstrates 
a rapid developmental switch from a symmetric to asymmetric 
input distribution for inhibition in the neural circuit of a principal 
cell. 

DS retinal ganglion cells respond to movement in a ‘preferred’ 
direction with robust spiking, but show minimal response to movement
in the opposite, or ‘null’, direction. DS cells receive
GABAergic inhibitory inputs from starburst amacrine cell processes
pointing in the null direction, but not from those pointing in the
preferred direction. Glutamatergic excitatory input from bipolar
cells is also directionally selective. Interestingly, starburst cells also
communicate to DS cells using acetylcholine, but this excitatory
connection seems to be symmetric. Directional selectivity is present
before eye opening (around postnatal day 13 (P13) in mice), as well as
in dark-reared animals, indicating that the establishment of circuit
asymmetry does not require visual experience. How and when
such highly specific synaptic connections are established between starburst
and DS cells during development remain unknown. Retinal cells
do not respond to light until P10-11 in mice, with the exception of
melanopsin-containing ganglion cells, which limits the ability to
follow the early development of functional connectivity. Directional
selectivity may develop by the asymmetric refinement of previously
formed inhibitory connections (Supplementary Fig. 1a) or, alternatively,
the inhibitory synaptic inputs form asymmetrically (Supplementary
Fig. 1b).

To distinguish between these possibilities, we probed the spatial dis- 
tribution of synaptic strengths from starburst amacrine cells to individual 
ON DS cells during postnatal development. ON DS cells respond to slow 
movement and are critical for mediating the optokinetic reflex. In
SPIG1-GFP (SPIG1, also known as Fstl4, locus driving green fluorescent
protein expression) knock-in mice, upward-motion-preferring ON DS
cells are selectively labelled with GFP throughout development in most
retinal regions.

To activate the starburst cells of SPIG1-GFP mice before amacrine and 
ganglion cells receive light-driven inputs from bipolar cells, this mouse 
line was crossed with another line expressing Cre recombinase specifically
in starburst cells (choline acetyltransferase (ChAT)-Cre knock-in
mice). At P0, we transduced these SPIG1-GFP × ChAT-Cre mice
with a Cre-recombinase-dependent adeno-associated virus (AAV) carrying
a reversed and double-floxed C128T mutant channelrhodopsin-2
(ChR2c) followed by 2A-DsRed2 (ChR2c-2A-DsRed2, see Methods,
Fig. 1a). The 2A sequence codes for a cis-acting hydrolase element that
creates equimolar amounts of ChR2c and red-fluorescent, soluble 
DsRed2. A soluble marker in the cell body allowed easier quantification 
of fluorescence, and therefore ChR2 expression, than in a membrane- 
bound fusion construct. ChR2c-expressing cells are responsive to light at 
an intensity 50-fold lower than cells expressing wild-type ChR2 (ref. 7) 
and could, therefore, be activated by light patterns generated by an over- 
head projector. 

Immunohistochemistry showed that all DsRed2-marked neurons 
were also positive for ChAT, a marker for starburst cells. Conversely, a 
substantial fraction (~60%) of starburst cells in both the ganglion cell 
layer (GCL) and inner nuclear layer (INL) (Fig. 1b, c) were DsRed2- 
labelled. Therefore, starburst cells, but no other cell type in these mice, 
are labelled red and express light-sensitive ChR2c, whereas upward-motion-preferring
ON DS cells are labelled green (Fig. 1e). First, we
characterized the light-excitability of ChR2c-positive starburst cells in 
intact, isolated retinas between P6 and P9. Light illumination evoked 
robust currents in DsRed2-expressing cells, even in the presence of 
glutamatergic synaptic blockers (CPP, NBQX, APB), suggesting 
ChR2c as the source of the currents (Fig. 1d). Increasing illumination 
evoked increasing membrane potential changes in starburst cells and, 
as expected due to the 2A element, the red fluorescence intensity of the 
recorded cell bodies correlated well (R = 0.83) with the magnitude of 
the membrane potential change at the stimulation intensity which is 
used to test the distribution of synaptic strengths in subsequent experi- 
ments (Supplementary Fig. 2). 

To test whether the genetically tagged neural circuit could report the 
synaptic strengths from starburst to ON DS cells, we isolated excitatory 
and inhibitory inputs to ON DS cells at P8 while stimulating ChR2c- 
expressing starburst cells with light patterns (see Methods). A full-field 
light step elicited both inhibitory and excitatory currents in ON DS cells 
(Fig. 1f and Supplementary Fig. 3). The inhibitory component was 
blocked by the GABA receptor antagonist picrotoxin, and the excitatory 
input by the cholinergic receptor antagonist curare. Blocking glutamate 
receptors had no effect on the light-evoked currents (Fig. 1f) or on the 
miniature excitatory postsynaptic currents (mEPSC, Supplementary 
Fig. 4). mEPSCs were blocked by curare (Supplementary Fig. 4). 
These results confirmed that ON DS cells receive GABAergic and
cholinergic synaptic inputs in response to starburst cell stimulation, but
do not receive glutamatergic synaptic input from bipolar cells at this
stage.

Figure 1 | Targeting of ChR2c to starburst amacrine cells at P8. a, AAV
vector. EF1a, promoter; ITR, inverted terminal repeat; WPRE, woodchuck
post-transcriptional regulatory element. b, Confocal images from an AAV-transduced
retina. c, Relationship between DsRed2-expressing and ChAT-positive
cells. d, Top, excitatory currents in an AAV-labelled starburst cell in the
presence of synaptic blockers. Full-field flash stimulus. Bottom, confocal image
of the recorded starburst cell. e, Top, confocal image of a retina in which ON DS
cells are expressing GFP (green) and starburst amacrine cells are expressing
ChR2c and DsRed2 (magenta). Bottom, side-view. IPL, inner plexiform layer.
f, Synaptic currents recorded at −60 mV (red) and 20 mV (blue) holding
potentials from a GFP-positive ON DS cell in response to a full-field flash. Error
bars, s.d.

The ChR2c-assisted synaptic strength mapping depends on direct 
connections between starburst and ON DS cells during early postnatal 
development. To test whether this is the case, we performed mono- 
synaptically restricted retrograde synaptic tracing with G-deleted rabies 
virus complemented with G-expressing herpes virus initiated from
GFP-labelled ON DS cells (see Methods, Supplementary Figs 5 and 6). 
At P6, starburst cells were rabies-labelled around infected GFP-marked 
ON DS cells (Fig. 2), indicating that starburst cells are directly con- 
nected to ON DS cells at this developmental stage. 

Having confirmed monosynaptic connection from starburst cells 
already at P6, we investigated the spatial distribution of the strength 
of synaptic connections by stimulating starburst cells with light steps in 
eight sectors surrounding the recorded ON DS cells (Fig. 3). We calcu- 
lated a spatial asymmetry index (SAI) that quantified the degree of 
spatial asymmetry of the synaptic inputs to ON DS cells along the 
dorso-ventral axis (see Methods). We found that the inhibitory input 
was already spatially asymmetric along the dorso-ventral axis by P8; 
stimulation of the ventral (null) side evoked more inhibitory current 
than stimulation of the dorsal (preferred) side. In contrast, the excitatory 
input was significantly more symmetric along the same axis (Fig. 3 and 
Supplementary Fig. 7). To avoid potential bias due to non-uniform viral 
transduction we normalized the synaptic currents with either the num- 
ber of DsRed2-expressing starburst cells (using a threshold) or the sum 
of the measured red fluorescence (which reflects the voltage change in 
starburst cells, as shown before, Supplementary Fig. 2) of the starburst
cells in each of the eight sectors in which the light stimulus was presented
(see Methods). The normalized responses, like the recorded raw
responses, also showed asymmetric inhibition and more symmetric
excitation along the dorso-ventral axis at P8 (Fig. 3, Supplementary
Figs 7 and 8). In contrast to P8, the raw and normalized inhibition
and excitation at P6 were symmetric along the same axis (Fig. 3, Supplementary
Figs 7 and 8). The lack of asymmetry in inhibition at P6 was
not due to ineffective activation of starburst cells because of low ChR2c
expression level, since half-maximal activation at P9, which should be
similar to maximal activation at P6 in terms of eliciting changes of
membrane potential (Supplementary Fig. 2), revealed asymmetry (Supplementary
Fig. 9).

Figure 2 | Monosynaptically restricted circuit mapping initiated from ON
DS cells. a, b, Live images of SPIG1-GFP retinas at P6 in which GFP-labelled
ON DS cells (green) were infected with G-deleted rabies expressing mCherry
(magenta) either alone (a) or in combination with G-encoding herpes virus
(b). c, Confocal images of a GFP-labelled ON DS cell (cyan) infected with
G-deleted rabies virus only (magenta). d, Confocal images of an ON DS cell
infected with both G-deleted rabies virus and G-encoding herpes virus from
b. Most of the labelled presynaptic cells (arrows) are ChAT-positive starburst
cells (yellow) in the GCL.

Next, we investigated the emergence of asymmetry from P6 to P9 
(Fig. 4). SAI of inhibition increased significantly, but there was no 
significant change in excitation between any pairs of days (Supplemen- 
tary Fig. 8, note: the lack of stars between pairs of conditions on any of 
the figures means no significant change). The lack of statistically sig- 
nificant change in excitation was not due to saturating intensities 
because half-maximal activation of excitatory inputs to ON DS cells 
did not significantly change the SAI of excitation at P9 (Supplementary 
Fig. 9). The mean direction of inhibitory input of individual recorded 
cells, computed as the vector sum of inputs for all eight directions, was 
random at P6 but became confined to the ventral side by P8 (direction 
of red bars in Fig. 4h). Because the variation in DsRed2 expression 
across the eight sectors was not statistically different between P6 and 
P9 (Supplementary Fig. 10), the randomness at P6 is not due to greater 
variation in gene expression from AAV at earlier time points. We con- 
clude that before P6 the spatial distribution of inhibitory connectivity 
between starburst cells and ON DS cells is either random or symmetric 
(with some synapses having little strength). Inhibitory connectivity 
rapidly reorganizes to become asymmetric along the ‘preferred-null’ 
(dorso-ventral) axis between P6 and P8, whereas excitatory cholinergic
input remains significantly more symmetric throughout this developmental
period (Supplementary Fig. 1a).

Figure 3 | ChR2c-assisted circuit mapping at P8 and P6. a-f, Recordings
from ON DS cells at P8 (a-c) and P6 (d-f). Inhibitory (a, d) and excitatory
(b, e) postsynaptic currents elicited in an ON DS cell by the stimulation of eight
sectors surrounding the cell. Polar plots are also shown. c, f, Spatial asymmetry
index (SAI) for inhibition and excitation. g, Sketch of light patterns used to
stimulate one of eight sectors around the recorded ON DS cell. Error bars, s.d.

Is inhibitory connectivity strengthened at the ventral (‘null’) and
weakened at the dorsal (‘preferred’) side or is only one of these two
mechanisms driving the development of asymmetry? Since starburst
cells similarly control the strength of cholinergic excitation and
GABAergic inhibition to ON DS cells (Supplementary Fig. 11) and
excitation is not significantly different along the dorso-ventral axis
from P6 to P9, the ratio of inhibition to excitation (neither normalized)
in the dorsal and ventral sides should be a measure of the inhibitory
synaptic strength that depends less on the level of ChR2c expression
(which increases over the days, Supplementary Fig. 2d) than inhibition
alone. This ratio increased on the ventral side (though the increase was not
significant) and decreased on the dorsal side (Supplementary Fig. 12),
suggesting that ‘push-pull’ synapse reorganization is at work.

Figure 4 | Development of asymmetry. Coloured lines indicate normalized
current responses from individual cells and the black line indicates the mean
response of all recorded cells (or the mean cell number for a). Polar plot of the
number of DsRed2-expressing cells (a), excitatory (b) and inhibitory (c) inputs,
excitatory (d) and inhibitory (e) inputs normalized to the number of DsRed2-expressing
cells in each sector, excitatory (f) and inhibitory (g) inputs
normalized to the mean DsRed2 fluorescence intensity of cells in each sector.
h, Red bars indicate the vector sum of inhibitory inputs normalized to the
number of DsRed2-expressing cells in each sector.

The retinal stratum in which ON DS cells extend their dendrites
embodies three different directionally selective computations that lead
to preferential responses to nasal, upward and downward motion in
different types of ON DS cells. We suggest that, in the physical space
shared by these circuits, it is the spatially selective refinement of the
distribution of inhibitory input strength to each DS ganglion cell that
underlies the establishment of each directionally selective retinal circuit.
Higher-order brain computations, for example orientation selectivity in
the visual cortex, also rely on spatial circuit asymmetries. Mechanistic
insights from the development of retinal directional selectivity may help
to understand how asymmetry in cortical circuits is established.


METHODS SUMMARY 


On the day of birth (P0) starburst amacrine cells in the progeny of a cross between
mice expressing Cre in starburst amacrine cells and SPIG1-GFP mice expressing
GFP in ON DS cells were labelled with ChR2c in vivo by transduction with a
Cre-recombinase-dependent AAV. Retinas were isolated at P6 or later and GFP-labelled
ON DS cells were recorded in voltage clamp at −60 mV (for excitation)
or 20 mV (for inhibition) guided by a two-photon microscope at 930 nm (ref. 30).
Photostimulation was performed with white light steps (duration 2 s, inter-stimulus 
interval 10 s) generated by a digital light projector (PLUS Vision) and directed onto 
each of eight sectors surrounding the recorded cell. Stereotaxic injections of rabies 
and herpes viruses to the medial terminal nucleus (MTN) of SPIG1-GFP mice were
performed at P1 and infected retinas were isolated at P6. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

Animals. ChAT-Cre mice were purchased from Jackson Laboratory (strain:
B6;129S6-Chattm1(cre)Lowl/J). In ChAT-Cre mice, Cre recombinase is expressed
under the control of the ChAT locus. In SPIG1-GFP mice GFP is expressed under
the control of the SPIG1 locus. To obtain neonatal SPIG1-GFP × ChAT-Cre
mice we crossed SPIG1-GFP homozygous mice with ChAT-Cre homozygous
mice. All animal procedures were performed in accordance with standard ethical 
guidelines (European Communities Guidelines on the Care and Use of Laboratory 
Animals, 86/609/EEC) and were approved by the Veterinary Department of the 
Canton of Basel-Stadt, Switzerland. 

AAV plasmids. In the present paper, we refer to the C128T mutant ChR2 (refs 
6-8) as ChR2c. To obtain pAAV-EF1a-double floxed-ChR2c-2A-DsRed2-WPRE-hGHpA
we linearized pAAV-EF1a-double floxed-hChR2(H134R)-EYFP-WPRE-hGHpA
(provided by K. Deisseroth) using NheI/AscI. ChR2c was
PCR-amplified from pGEMHE-ChR2c using a HindIII-2A-covering primer. 
Forward primer: 5’-GCTAGCGCTAGCCACCATGGATTATGGAGGCGCCC 
TG-3’. Reverse primer: 5’-TCTCCCGCAAGCTTAAGAAGGTCAAAATTCIT 
GCCGGTGCCCTTGTTGAC-3’. DsRed2 was PCR-amplified from pDsRed2-N1 
(Clontech) using a HindIII-2A-covering primer. Forward primer: 5’-ACCTTC 
TTAAGCTTGCGGGAGACGTCGAGTCCAACCCTGGGCCCATGGCCTCCT 
CCGAGAACGTC-3’. Reverse primer: 5’-GGCGCGCCGGCGCGCCCTATC 
ACAGGAACAGGTGGTGGCG-3’. These two PCR products were then digested 
with NheI, AscI and HindIII and triple ligation was performed.

AAV production. Serotype 7 recombinant AAVs were made by Penn Vector 
Core. Penn Vector Core performed the genome copy (GC) number titration (titre: 
5.78 × 10¹² GC per ml) using real-time PCR (TaqMan reagents, Applied
Biosystems). 

Logic of AAV labelling. We used the viral vector AAV EF1a-double floxed-ChR2c-2A-DsRed2
to target the expression of ChR2c to starburst cells expressing
Cre. In AAV EF1a-double floxed-ChR2c-2A-DsRed2, two incompatible loxP
variants, loxP and lox2722, flank an inverted version of ChR2c followed by the
red fluorescent marker DsRed2. In the presence of Cre, a stochastic recombination
of either loxP variant takes place, resulting in the inversion of ChR2c-2A-DsRed2
into the sense direction, followed by expression of the ChR2c-2A-DsRed2. The 2A
sequence codes for a cis-acting hydrolase element that creates equimolar
amounts of ChR2c and red-fluorescent DsRed2. 

AAV injection. We injected the virus on the day of birth. SPIG1-GFP × ChAT-Cre
mice were anesthetized with crushed ice. Virus (1.5 µl, 8.68 × 10⁹ GC) was
loaded into pulled glass pipettes (tip diameter, 30 µm) and injected into the vitreal
space of both eyes using a microinjector (Narishige). After 6 days, DsRed2 expres- 
sion was brightly detectable; hence all recordings were performed on retinas at P6 
or later. 

Preparation of retinas. Neonatal mice were killed by decapitation. Eyes were 
enucleated. The retinas were isolated and the pigment epithelium removed in 
Ringer’s medium (in mM: 110 NaCl, 2.5 KCl, 1 CaCl2, 1.6 MgCl2, 10 D-glucose,
22 NaHCO3, bubbled with 5% CO2/95% O2, pH 7.4) and mounted ganglion-cell-side
up on a filter (MF-membrane, Millipore) with a 2-mm rectangular aperture in
the centre. Before starting superfusion, DsRed2-positive regions together with 
GFP-positive cells were located with an epifluorescence stereomicroscope 
(Olympus) and photographed for later determination of the average DsRed2 
fluorescence intensity around recorded ON DS cells for data normalization and 
orientation of the retina. Only GFP cells surrounded by DsRed2 expression were 
chosen for the recordings. The orientation of the isolated SPIG1-GFP × ChAT-Cre
retina was determined by the pattern of GFP expression. In most retinal
regions, GFP is expressed exclusively in one type of ON DS cells during the 
developmental period. An exception is the dorsal (slightly temporal) region, where 
GFP is expressed in many different ganglion and amacrine cell types. A thick axon 
bundle in this region runs in the dorso-ventral direction towards the optic disk and 
can be used as a compass in isolated retinas. The retinas were superfused in
Ringer’s medium at 35-36°C in the microscope chamber for the duration of 
the experiment. In this retinal preparation, ChR2c-mediated light responses could 
be measured for more than 8h. 

Two-photon imaging, electrophysiology and pharmacology. Fluorescent cells 
were found with the help of a two-photon microscope equipped with a Mai Tai HP 
two-photon laser (930 or 1,010 nm) (Spectra Physics) integrated into the
electrophysiological setup. Current recordings were made in whole-cell voltage clamp
mode using an Axon Multiclamp 700B amplifier with borosilicate glass electrodes
(BF100-50-10, Sutter Instruments) pulled to 7-9 MΩ, and filled with (in mM)
112.5 CsCH3SO3, 1 MgSO4, 7.8 × 10⁻³ CaCl2, 0.5 BAPTA, 10 HEPES, 4 ATP-Na2,
0.5 GTP-Na3, 5 lidocaine N-ethylbromide (QX314-Br), 7.75 neurobiotin chloride,
pH 7.2. Excitatory and inhibitory synaptic currents (‘excitation’ and ‘inhibition’,
respectively) were separated by voltage-clamping the cell to the equilibrium potential
of chloride (−60 mV) and unselective cation channels (20 mV), respectively. For
recording mEPSCs, cells were voltage-clamped at −60 mV and recorded for
3-5 min. Voltage recordings from DsRed2-positive starburst cells were made in 
whole-cell current clamp mode with glass electrodes pulled to 7-9 MΩ and filled
with (in mM) 115 K-gluconate, 9.7 KCl, 1 MgCl2, 0.5 CaCl2, 1.5 EGTA, 10 HEPES, 4
ATP-Na2, 0.5 GTP-Na3, pH 7.2. In pharmacological experiments, agents were
bath-applied at the following concentrations: 10 µM CPP ((+)-3-(2-carboxypiperazin-4-yl)
propyl-1-phosphonic acid, blocking NMDA receptors), 10 µM NBQX
(6-nitro-2,3-dioxo-1,4-dihydrobenzo[f]quinoxaline-7-sulfonamide, blocking AMPA
and kainate receptors), 10 µM APB (L-(+)-2-amino-4-phosphonobutyric acid,
blocking metabotropic glutamate receptors and therefore blocking the ON
pathway), 50 µM curare (tubocurarine chloride, blocking nicotinic acetylcholine
receptors), 100 µM picrotoxin (blocking GABA A and C receptors). All chemicals
were obtained from Sigma, with the exception of APB (Calbiochem). Data were 
analysed offline with Mathematica (Wolfram Research). 

Photostimulation. ChR2c was activated with light generated by a digital light 
projector (V-332, PLUS Vision). The stimulation was generated via custom-made 
software (Matlab, Mathworks; Labview, National Instruments). Light intensity 
was measured using a spectrometer (USB 4000, Ocean Optics) calibrated with a 
reference source (LS1-Cal, Ocean Optics). Light intensity was modulated by using 
different grey scales (0-255) combined with different neutral density filters (ND0,
ND10, ND20 and ND30). To correlate the stimulus intensity and the change in
membrane potential in ChR2c-expressing starburst cells (Supplementary Fig. 2),
we used full-field flashes at 24 different intensities (12.08-15.86 in log intensity,
photons cm⁻² s⁻¹) presented for 2 s with an inter-stimulus interval of 5 s. For the
stimulation of the eight sectors (Figs 3, 4), each stimulus was presented for 2 s with 
an inter-stimulus interval of 10s. The eight sectors were stimulated in a random 
order. The light pattern was focused on the GCL. To find a ‘weak light intensity’ 
that evoked half the maximum excitatory current input to ON DS cells (Supplementary
Fig. 9), full-field flashes at 24 different intensities (12.08-15.86 in log
intensity, photons cm⁻² s⁻¹) were first presented sequentially (presentation for 2 s with
an inter-stimulus interval of 5 s), and the retinas were then stimulated
in eight sectors using the determined light intensity.
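
The sector stimulation protocol (eight sectors, random order, 2 s per stimulus, 10 s inter-stimulus interval) can be sketched as a simple schedule generator. This is an illustration only and not the custom Matlab/Labview software mentioned above; the function name is hypothetical, and the sketch assumes that the inter-stimulus interval is measured from the offset of one stimulus to the onset of the next, which the text does not specify.

```python
import random


def sector_schedule(n_sectors=8, duration_s=2.0, isi_s=10.0, seed=None):
    """Return a list of (sector, onset_s, offset_s) in randomized sector order.

    Assumption: isi_s is the gap between one stimulus offset and the next
    stimulus onset.
    """
    rng = random.Random(seed)
    order = list(range(n_sectors))
    rng.shuffle(order)  # each sector is presented exactly once per sweep

    schedule = []
    onset = 0.0
    for sector in order:
        schedule.append((sector, onset, onset + duration_s))
        onset += duration_s + isi_s
    return schedule


# One randomized sweep over the eight sectors (seed fixed only for repeatability).
sweep = sector_schedule(seed=1)
```

Repeating the sweep 3-10 times with a fresh shuffle each time would correspond to the repeated presentations described under data collection.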

Data collection and analysis. Light stimulation of each of the eight sectors (Fig. 3g) 
around the recorded ON DS cell body was repeated in each recorded ON DS cell 
3-10 times for both excitation and inhibition and the mean light responses were 
determined for all eight directions. To correct for non-uniform viral expression in
the eight sectors, we performed two different types of normalizations. In the first 
procedure, we normalized the current evoked in each sector by the number of 
DsRed2-expressing cells in the sector within 200 µm of recorded ON DS cell bodies
in the GCL. Here we used an arbitrary fluorescence threshold that was constant
between experiments. We chose the particular distance of 200 µm because the
radius of the dendrites of ON DS cells plus the radius of the processes of starburst 
amacrine cells at P6 and P9 were together less than 200 um (data not shown). 

In the second procedure, we normalized each sector to the average fluorescence 
intensity of the starburst cells in the sector and not just the number. This was 
reasonable because the fluorescence intensity of the starburst cells correlated well 
(R = 0.83) with the magnitude of the voltage response of the starburst cell at the 
stimulation intensity used for the mapping procedure (Supplementary Fig. 2): 
therefore, the average fluorescence is a measure of the stimulation strength of 
the starburst cells in the sector. 

To yield the eight quantities plotted on polar plots, these normalized (or not 
normalized) values were further normalized to the largest of the eight numbers 
(for excitation and inhibition, independently). Note that this normalization is 
useful for eliminating variations in synaptic currents arising from the patch-clamp 
technique including series resistance and leak conductance. Note that the largest 
value (of the eight) to which normalization is performed is a mean of a distribution 
since each segment was stimulated 3-10 times (see above). The direction of the 
tuning was determined by multiplying the eight values above with the correspond- 
ing unit vectors pointing in eight directions and then forming the vector sum. The 
direction of the vector sum was interpreted as the direction of tuning. 
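
As a minimal sketch of the vector-sum step described above, the following assumes that the eight sectors are centred at multiples of 45° and ordered counter-clockwise from an arbitrary 0° reference. The sector ordering, the reference direction and the function name `tuning_direction` are illustrative assumptions and are not taken from the original analysis.

```python
import math


def tuning_direction(responses):
    """Return (angle_deg, length) of the vector sum of eight sector responses.

    responses[k] is the mean response evoked from the sector centred at
    k * 45 degrees (assumed ordering). Values are first normalized to the
    largest of the eight numbers, as described in the text.
    """
    if len(responses) != 8:
        raise ValueError("expected eight sector responses")
    peak = max(responses)
    if peak <= 0:
        raise ValueError("largest response must be positive")
    normalized = [r / peak for r in responses]
    # Multiply each value by its unit vector and form the vector sum.
    x = sum(r * math.cos(math.radians(45 * k)) for k, r in enumerate(normalized))
    y = sum(r * math.sin(math.radians(45 * k)) for k, r in enumerate(normalized))
    angle = math.degrees(math.atan2(y, x)) % 360
    return angle, math.hypot(x, y)
```

A response profile that is symmetric around one sector yields a tuning direction pointing at that sector, whereas eight equal responses give a vector sum of near-zero length, in which case the angle carries no information.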

The spatial asymmetry index (SAI) was calculated as: 

SAI = (I_ventral − I_dorsal)/(I_ventral + I_dorsal) 

where I_ventral and I_dorsal are the amplitudes of the normalized or not normalized 
currents evoked by the stimulation of ventral or dorsal sectors (both normalized 
and not normalized SAIs are shown in Fig. 3 and Supplementary Figs 7-9). 
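
The index can be sketched in a few lines; the function name `spatial_asymmetry_index` is hypothetical, and the current amplitudes are assumed to be supplied as non-negative magnitudes.

```python
def spatial_asymmetry_index(i_ventral, i_dorsal):
    """SAI = (I_ventral - I_dorsal) / (I_ventral + I_dorsal).

    Ranges from -1 (input from dorsal sectors only) to +1 (input from
    ventral sectors only) when both amplitudes are non-negative.
    """
    total = i_ventral + i_dorsal
    if total == 0:
        raise ValueError("SAI is undefined when both amplitudes are zero")
    return (i_ventral - i_dorsal) / total
```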

Monosynaptic restriction of circuit tracing. To create rabies G-expressing 
replication-defective herpes simplex virus-1 (HSV1), the GFP open reading frame 
(ORF) in the HSV1 vector pR19EFla-GFP-WCm (Biovex) was replaced with that 
of G. First, the G ORF was amplified by PCR from pHCMV-G using primers 
5′-GTGTCGTGAGGAATTCGTACCGGATCCTCTAGGCCACC-3′ and 
5′-CCGCTTTACTTGTACATTACAGTCTGGTCTCACCCCCACT-3′. The GFP ORF 
was removed from pR19EFla-GFP-WCm by EcoRI/BsrGI digestion and the PCR 
fragment of G was recombined into the EcoRI-BsrGI site using an In-Fusion PCR kit 
(Takara-Clontech). The viral particles were produced by Biovex. G-deleted rabies 
virus encoding mCherry (SADΔG-mCherry) was a gift from E. Callaway. Rabies 
virus expressing mCherry instead of the G glycoprotein was harvested from 
BHK-B19G cells (provided by E. Callaway) and centrifuged as described earlier. We 
performed stereotaxic surgery to label ganglion cells projecting to the medial 
terminal nucleus (MTN). A cocktail of 10° plaque-forming units of rabies virus 
and 6 × 10* plaque-forming units of HSV1 in 20 nl DMEM was loaded into 
pulled-glass pipettes (tip inner diameter of 20-30 µm) and injected into the 
MTN using a microinjector (Narishige, IM-9B). For control experiments, we 
injected 2 × 10° plaque-forming units of rabies virus. Injections were performed 
with 24 mice at P1. Retinas were isolated at P6. Six well infected retinas at P6 were 
fixed by PFA and stained with antibodies. Brains were also isolated and the 
injection sites were localized. All rabies and HSV1 work was carried out under 
Biosafety level 2 conditions. 

The key point of viral tracing was to infect with rabies and herpes viruses an 
upward-motion sensitive ON DS ganglion cell to initiate the retrograde passage of 
rabies viruses from this cell type. Because GFP exclusively labelled upward-motion 
sensitive ON DS cells in the ventral retina of SPIG1-GFP mice it was enough to 
examine the rabies-labelled cells around a GFP-labelled ganglion cell regardless of 
the injection site. The reason we performed the viral tracing in the SPIG1-GFP line 
was to have an internal control for the ganglion cell type whose local circuit we 
examined. The fact that rabies did label GFP cells (in red) shows that the 
injection reached the MTN (because SPIG1-GFP cells exclusively target the 
MTN; see also Supplementary Fig. 6). Even if by mistake we had also injected these 
viruses into other retinorecipient brain regions, this would not compromise our 
tracing results of the GFP-labelled ganglion cells provided, first, that the rabies-labelled 
circuits in the retina were far enough apart that the ganglion cell to which an 
amacrine cell is presynaptic could be determined and, second, that only one 
ganglion cell was labelled in that local circuit. The reason we used low rabies titres 
for the herpes/rabies co-injections was to make sure that the circuits analysed were 
far away from each other in the retina (see Fig. 2b). Every circuit we analysed 
contained only one ganglion cell (which was GFP-labelled, see Fig. 2d). The 
definition of ‘ganglion cell’ was based on the existence of an axon. 

Immunohistochemistry. After the experiments, retinas were fixed for 30 min in 
4% (w/v) paraformaldehyde in PBS (137 mM NaCl, 2.7 mM KCl, 4.3 mM 
Na₂HPO₄, 1.47 mM KH₂PO₄, pH 7.4) and washed with PBS for at least 1 day at 
4°C. To aid penetration of the antibodies, retinas were frozen and thawed three 
times after cryoprotection with 30% (w/v) sucrose. All other procedures were 
carried out at room temperature. After washing in PBS, retinas were blocked for 
1 h in 10% (w/v) normal donkey serum (NDS; Chemicon), 1% (w/v) bovine serum 
albumin (BSA), and 0.5% (v/v) Triton X-100 in PBS. Primary antibodies were 
incubated for 7 days in 3% (v/v) NDS, 1% (w/v) BSA, 0.02% (w/v) sodium azide 
and 0.5% (v/v) Triton X-100 in PBS. Secondary antibodies were incubated for 2 h 
in 3% (v/v) NDS, 1% (w/v) BSA, and 0.5% (v/v) Triton X-100 in PBS together with 
streptavidin-Alexa Fluor 633 (Invitrogen, 1:200) and DAPI (4′,6-diamidine-2-phenylindole 
dihydrochloride, Roche Diagnostics, 10 µg ml⁻¹) in some experiments. 
Streptavidin binds to neurobiotin and thus labels neurobiotin-filled cells; 
DAPI binds to DNA and therefore labels nuclei. After a final wash in PBS, retinas 
were embedded in Prolong Gold antifade (Invitrogen). 

The following combinations of primary and secondary antibodies were used 
in experiments in which we recorded from SPIG1 cells while stimulating 
ChR2c-2A-DsRed2-expressing starburst cells: (1) Primary: goat anti-ChAT (1:200, 
AB144P, Chemicon). Secondary: donkey anti-goat IgG conjugated with Alexa 
Fluor 405 (1:200, Invitrogen). (2) Primary: rabbit anti-red fluorescent protein 
(RFP; 1:200, AB3216, Chemicon). This primary antibody binds to DsRed2. 
Secondary: donkey anti-rabbit IgG conjugated with Cy3 (1:200, Jackson). (3) 
Primary: rat anti-GFP (1:500, 04404-84, Nacalai). Secondary: donkey anti-rat IgG 
conjugated with Alexa Fluor 488 (1:200, Invitrogen). The following combinations of 
primary and secondary antibodies were used for staining rabies virus-infected 
retinas: (1) Primary: goat anti-ChAT (1:200, AB144P, Chemicon). Secondary: 
donkey anti-goat IgG conjugated with Alexa Fluor 633 (1:200, Invitrogen). (2) 
Primary: rabbit anti-red fluorescent protein (RFP; 1:200, AB3216, Chemicon). 
Secondary: donkey anti-rabbit IgG conjugated with Cy3 (1:200, Jackson). (3) 
Primary: rat anti-GFP (1:500, 04404-84, Nacalai). Secondary: donkey anti-rat IgG 
conjugated with Alexa Fluor 488 (1:200, Invitrogen). 

Confocal analysis. Stained retinas were analysed with a Zeiss LSM 700 confocal 
microscope. The DsRed2-expressing cell number was assessed from z-stack 
images by using a ×20 lens, numerical aperture (NA) 0.7, ×0.5 digital zoom. 
All images were recorded at the same laser power and gain control. Overall 
morphologies of the recorded starburst or ganglion cells were assessed using a 
×40 oil immersion lens, NA 1.2, ×0.5 digital zoom or ×63 oil immersion lens, NA 
1.3, ×0.5 digital zoom. The mCherry-labelled presynaptic circuits of ON DS cells 
were assessed from z-stack images using a ×63 oil immersion lens, NA 1.3. 

Statistical analysis. The non-parametric Mann-Whitney U test was used to 
compare data obtained from different cells on different days and the Wilcoxon 
signed rank test for comparing pairs of data where each pair was obtained from the 
same cell (excitation and inhibition). Significance is denoted by * for P < 0.05 and 
** for P < 0.01. The error bars and ± values represent standard deviations (s.d.). 
On each figure, the lack of stars between any pairs of data signifies P > 0.05 and, 
therefore, that the two distributions are not statistically different. 
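
To illustrate the unpaired comparison, the sketch below computes the Mann-Whitney U statistic and an exact two-sided P value by enumerating every relabelling of the pooled observations. Whether the original analysis used an exact or an asymptotic P value is not stated, so the exact enumeration is an assumption chosen because it suits small samples; the function names are hypothetical.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: number of (xi, yj) pairs with xi > yj; ties count 0.5."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


def exact_two_sided_p(x, y):
    """Exact two-sided P value for the Mann-Whitney U test.

    Enumerates all ways to split the pooled data into groups of the
    observed sizes, so it is only practical for small samples.
    """
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    centre = n * m / 2  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - centre)
    extreme = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(n + m) if i not in chosen]
        if abs(mann_whitney_u(gx, gy) - centre) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total
```

With three observations per group, even complete separation gives P = 0.1, which shows why small samples cannot reach the P < 0.05 threshold with this test.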
Cortical representations of olfactory 
input by trans-synaptic tracing 


Kazunari Miyamichi, Fernando Amat, Farshid Moussavi, Chen Wang, Ian Wickersham, Nicholas R. Wall, Hiroki Taniguchi, 
Bosiljka Tasic, Z. Josh Huang, Zhigang He, Edward M. Callaway, Mark A. Horowitz & Liqun Luo 


In the mouse, each class of olfactory receptor neurons expressing a given odorant receptor has convergent axonal 
projections to two specific glomeruli in the olfactory bulb, thereby creating an odour map. However, it is unclear 
how this map is represented in the olfactory cortex. Here we combine rabies-virus-dependent retrograde 
mono-trans-synaptic labelling with genetics to control the location, number and type of ‘starter’ cortical neurons, 
from which we trace their presynaptic neurons. We find that individual cortical neurons receive input from multiple 
mitral cells representing broadly distributed glomeruli. Different cortical areas represent the olfactory bulb input 
differently. For example, the cortical amygdala preferentially receives dorsal olfactory bulb input, whereas the 
piriform cortex samples the whole olfactory bulb without obvious bias. These differences probably reflect different 
functions of these cortical areas in mediating innate odour preference or associative memory. The trans-synaptic 
labelling method described here should be widely applicable to mapping connections throughout the mouse nervous 
system. 


The functions of mammalian brains are based on the activity patterns 
of large numbers of interconnected neurons that form information 
processing circuits. Neural circuits consist of local connections— 
where pre- and postsynaptic partners reside within the same brain 
area—and long-distance connections, which link different areas. 
Local connections can be predicted by axon and dendrite reconstruc- 
tions', and confirmed by physiological recording and stimulation 
methods’. Long-distance connections are more difficult to map, as 
commonly used methods can only trace bulk projections with a coarse 
resolution. Most methods cannot distinguish axons in passing from 
those that form synapses, or pinpoint the neuronal types to which 
connections are made’’. Trans-synaptic tracers can potentially over- 
come these limitations. Here we combine a retrograde rabies-virus- 
dependent mono-trans-synaptic labelling technique’ with genetic 
control of the location, number and cell type of ‘starter’ neurons to 
trace their presynaptic partners. We systematically mapped long- 
distance connections between the first olfactory processing centre, 
the olfactory bulb, and its postsynaptic targets in the olfactory cortex 
including the anterior olfactory nucleus (AON), piriform cortex and 
amygdala (Supplementary Fig. 1). 


Genetic control of trans-synaptic tracing 


Rabies virus can cross synapses from postsynaptic to presynaptic 
neurons with high specificity*, without notable defects in the mor- 
phology or physiology of infected neurons for extended periods of 
time*’. Recent genetic modifications of rabies virus have permitted 
mono-trans-synaptic labelling’. Specifically, the rabies envelope gly- 
coprotein (G) required for viral spread was replaced with a fluorescent 
marker®. Further, the virus was pseudotyped with EnvA, an avian 
virus envelope protein that lacks an endogenous receptor in mam- 
mals, and thus cannot infect wild-type mammalian cells. However, it 
can infect cells expressing the EnvA receptor TVA, and can subsequently 
produce infectious particles if TVA-expressing cells also express G to 
complement the ΔG rabies virus (Fig. 1a, bottom). The 
new viral particles can cross synapses to label presynaptic partners of 
starter neurons. As trans-synaptically infected neurons do not express 
G, the modified virus cannot spread from them to other neurons. 
Paired recordings in cultured brain slices support the efficacy and 
specificity of this strategy’. 

To extend this method to a limited number of starter cells of a 
defined type and at a precise location in vivo, we combined mouse 
genetics and viral infections (Fig. 1a, b). We created a transgenic mouse 
(CAG-stop-tTA2) that conditionally expresses the tetracycline transactivator 
tTA2 under the control of a ubiquitous CAG promoter only 
upon Cre-mediated excision of a transcriptional stop cassette. After 
crossing these mice with transgenic mice expressing the tamoxifen-inducible 
Cre (CreER), a small fraction of CreER⁺ cells also express 
tTA2 following tamoxifen induction. We then used stereotactic injections 
to deliver into specific regions of the brain an adeno-associated 
virus (AAV) serotype 2 expressing three proteins: histone-GFP, TVA 
and G, under the control of a tetracycline-response element (TRE). 
Expression of TVA and G allows infected, tTA2⁺ cells to be receptive 
to infection by the modified rabies virus, which we injected into the 
same location two weeks later. We define starter cells as those infected 
by both AAV and rabies virus, and therefore labelled by both histone- 
GFP and mCherry; their presynaptic partners are infected only by 
rabies virus and therefore express only mCherry. 
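
The labelling logic in this definition can be summarized as a small decision rule. The sketch below is illustrative only: the function name `classify_cell` and the category labels are hypothetical, and in practice each marker would be scored from fluorescence images rather than supplied as a boolean.

```python
def classify_cell(has_histone_gfp, has_mcherry):
    """Classify a cell by its two fluorescent markers.

    histone-GFP reports AAV-driven expression; mCherry reports infection
    by the modified rabies virus.
    """
    if has_histone_gfp and has_mcherry:
        return "starter"  # infected by both AAV and rabies virus
    if has_mcherry:
        return "presynaptic partner"  # rabies virus only
    if has_histone_gfp:
        return "AAV only"  # expresses the AAV transgene, no rabies infection
    return "unlabelled"
```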

We tested our strategy by using a ubiquitously expressing actin-CreER 
in combination with CAG-stop-tTA2 in the neocortex. Starter 
cells could be unambiguously identified by histone-GFP expression 
(Supplementary Fig. 2). In all but one case, we observed more than 
one starter cell (Supplementary Fig. 3 shows the example of a single 
starter cell). In a typical example, 35 starter cells in the motor cortex 
expressed histone-GFP and mCherry (Fig. 1c (3)), demonstrating 
that AAV and rabies virus can infect the same cells in vivo. In addition 
to many locally labelled cells (Fig. 1c (1)), mCherry⁺ cells were 
Figure 1 | Genetic control of rabies-mediated neural circuit tracing. 

a, b, Schematic representation of the methodology used to control the location, 
number and type of starter cells for rabies-virus-mediated trans-synaptic 
labelling. tTA2 is expressed in a small subset of CreER⁺ cells (grey nuclei in 
b). tTA2 activates an AAV-delivered transgene to express: (1) a histone-GFP 
marker to label the nuclei of starter cells in green; (2) EnvA receptor (TVA) to 
enable subsequent infection by EnvA-pseudotyped rabies virus 
(rabiesΔG-mCherry+EnvA); and (3) rabies glycoprotein (G) to initiate trans-synaptic 
labelling. c, Top left, a 60-µm coronal section that includes the injection site in 
the motor cortex (M1). Cells labelled with both histone-GFP (nGFP) and 
mCherry (arrowheads in c (1), magnified in c (3)) can be distinguished from 
cells labelled with mCherry alone, which are found near the injection site (c (1)), 
in the contralateral motor cortex (c (2)), in the somatosensory barrel cortex (top 
right; magnified in c (4)), and in the motor-specific ventrolateral nucleus of the 
thalamus (c (5)). Scale bars, 1 mm for low-magnification images at the top, 
100 µm for high-magnification images at the bottom. 


enriched in layers II, III and V in the contralateral motor cortex 
(Fig. 1c (2)), consistent with layer specificity of callosal projections. 
mCherry⁺ cells were also found in layers III and V of the ipsilateral 
somatosensory cortex (Fig. 1c (4)) and in motor-specific thalamic 
nuclei (Fig. 1c (5)), which are known sources of monosynaptic inputs 
to the motor cortex. 

In all experiments, histone-GFP⁺ cells were found within 450 µm 
of the injection sites, consistent with a previous report that AAV 
serotype 2 predominantly infects neurons locally. Omitting AAV 
or tamoxifen yielded no trans-synaptically labelled neurons (Sup- 
plementary Fig. 4). Moreover, our strategy labelled neurons only 
through synaptic connections but not through axons in passage (Sup- 
plementary Fig. 5). Finally, rabies virus spread was restricted to 
neurons directly connected to starter cells, and only in the retrograde 
direction (Supplementary Fig. 6). Together, these experiments vali- 
dated our genetic strategy for retrograde mono-trans-synaptic label- 
ling in vivo. 
AON maintains the dorsal-ventral topography 


In the mouse, olfactory receptor neurons that express a single type of 
odorant receptor send convergent axonal projections to a specific pair 
of glomeruli in the lateral and medial olfactory bulb. Odorants are 
detected by combinations of olfactory receptor neuron classes, and 
are represented as spatiotemporal activity patterns of glomeruli. 
Each mitral cell sends its apical dendrite to a single glomerulus and 
thus receives direct input from a single olfactory receptor neuron 
class. Mitral cell axons relay information to the olfactory cortex 
(Supplementary Fig. 1a). Previous axon tracing studies showed that 
individual mitral cells send axons to distinct cortical areas, and that 
small cortical regions receive broad input from the olfactory bulb'*. 
However, understanding the principles underlying odour perception 
and odour-mediated behaviours requires systematic and quantitative 
analysis of connection patterns of mitral cells with their cortical target 
neurons. 

We first established that mitral cells throughout the olfactory bulb 
can be infected by rabies virus via their axons (Supplementary Fig. 7). 
We then applied our strategy (Fig. 1a, b) to specific areas of the AON, 
piriform cortex and cortical amygdala (Supplementary Fig. 1b), and 
examined the distribution of trans-synaptically labelled mitral cells. In a 
typical example, 11 clustered starter cells in the AON (Fig. 2a) resulted in 
69 labelled mitral cells distributed widely across the olfactory bulb (Sup- 
plementary Fig. 8 and Supplementary Movie 1). Bright mCherry fluor- 
escence from rabies virus allowed us to unequivocally follow the primary 
dendrites of the labelled mitral cells to single target glomeruli (Fig. 2b). 
Each mitral cell sent its apical dendrite into a single glomerulus. Four 
glomeruli were each innervated by two labelled mitral cells (Fig. 2b, 
right, and Supplementary Table 1). 

To quantitatively compare the patterns of labelled glomeruli from 
different animals, we established a three-dimensional (3D) reconstruction 
protocol for the olfactory bulb, and aligned each olfactory bulb to a 
standard olfactory bulb model (Fig. 2c). To test the accuracy of this 
procedure, we reconstructed and aligned olfactory bulbs from three 
P2-IRES-tauGFP transgenic mice. These GFP-labelled glomeruli 
were located within a distance of a few glomeruli from each other 
(Supplementary Fig. 9), consistent with the natural variability of olfactory 
receptor neuron axon targeting. This precision of our 3D reconstruction 
enables the comparison of olfactory bulbs from different animals. 

The AON has been proposed to provide feedforward modification of 
information from the olfactory bulb to the piriform cortex”. Little is 
known about its organization except for a small and distinct AON pars 
externa, which maintains dorsal-ventral olfactory bulb topography”. 
We injected AAV and rabies virus to different areas of the AON (Sup- 
plementary Table 1), and established an AON 3D-reconstruction pro- 
tocol analogous to that for the olfactory bulb (Fig. 2c, left). Labelled 
glomeruli from AON injections were distributed widely in the olfactory 
bulb (Fig. 2c, middle). However, starter cells from the ventral and dorsal 
AON preferentially labelled ventral and dorsal glomeruli, respectively 
(Fig. 2c). To quantify the spatial distributions of starter cells in the AON 
and trans-synaptically labelled glomeruli in the olfactory bulb, we introduced 
a cylindrical coordinate system into the olfactory bulb and AON 
models, where Z represents the position along the anterior-posterior 
axis and θ represents the angle from the polar axis (Fig. 2c). No correlations 
were found between Z_AON and θ_OB (where OB is olfactory bulb), 
Z_AON and Z_OB, or θ_AON and Z_OB (Supplementary Fig. 10a). However, 
we found a strong positive correlation (R² = 0.79) between θ_AON and 
θ_OB (Fig. 2d), which correspond to the dorsal-ventral axes of the AON 
and olfactory bulb, respectively. Thus, the AON maintains the dorsal- 
ventral topography of the olfactory bulb. 
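
A minimal sketch of the two computations involved is given below: conversion of a reconstructed position into cylindrical coordinates, and the square of Pearson's correlation coefficient between two sets of mean angles. The choice of the +x axis as the polar axis, the treatment of angles as ordinary linear values, and the function names are assumptions made for illustration.

```python
import math


def cylindrical(x, y, z):
    """Return (theta_deg, z) for a point, with theta measured from the +x axis
    (taken here as the polar axis) around the anterior-posterior z-axis."""
    return math.degrees(math.atan2(y, x)) % 360, z


def pearson_r_squared(xs, ys):
    """Square of Pearson's correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy * sxy / (sxx * syy)
```

Here `pearson_r_squared` would be applied to the per-injection mean θ_AON of the starter cells and the mean θ_OB of the labelled glomeruli, one pair of values per experiment.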

A coarse topography exists between olfactory receptor neuron cell- 
body positions in the olfactory epithelium and target glomeruli in the 
olfactory bulb along the dorsal-ventral axis. Specifically, the olfactory 
cell adhesion molecule (OCAM) is expressed in a subset of olfactory 
receptor neurons that project to the ventral ~55% of glomeruli in the 
olfactory bulb. In the olfactory bulb, ~25° clockwise rotation of the 
Figure 2 | The olfactory bulb to AON connections show a dorsal-ventral 
topography. a, A 60-µm coronal section with two starter cells located in layer II 
of the ventrolateral AON, one of which (arrow) is magnified in the inset. RMS, 
rostral migratory stream. Scale bar, 500 µm. b, Typical examples of trans-synaptically 
labelled mitral cells from cortical starter cells. Left, a 60-µm coronal 
section that captures both the cell body and the apical dendrite of a mitral cell. 
Right, more frequently, a mitral cell apical dendrite spans several consecutive 
60-µm coronal sections. S-Glo, glomerulus innervated by a single labelled 
mitral cell (M). D-Glo, glomerulus innervated by two labelled mitral cells (Ma, 
Mb). A, anterior; P, posterior. Scale bar, 100 µm. c, Superimposed 3D 
reconstructions of the AONs and olfactory bulbs (OBs) from two injected 
brains. Eleven red and four green starter cells from two AONs labelled red and 
green glomeruli, respectively. Light red and green, contours of two 
superimposed AONs. D, dorsal; L, lateral; M, medial, V, ventral. 

d, e, Correlations between θ_AON and θ_OB (d) and θ_AON and θ′_OB (e). Crosses 
represent mean θ_AON (x-axis) and mean θ_OB or θ′_OB (y-axis). Error bars 
represent 50% of the distribution surrounding the mean θ_OB or θ′_OB. R², square 
of Pearson’s correlation coefficient; P, statistical significance tested against the 
null hypothesis assuming no correlation between θ_AON and θ_OB or θ′_OB. Red 
and blue, experiments using actin-CreER and CaMKII-CreER, respectively. 


polar axis around the z-axis maximized the separation of OCAM⁺ and 
OCAM⁻ glomeruli (Supplementary Fig. 11). In this new OCAM 
coordinate system represented by θ′_OB (Fig. 2c, right), the correlation 
coefficient between θ′_OB and θ_AON increased to R² = 0.89 (Fig. 2e), 
showing that adjusting the dorsal-ventral axis of the olfactory bulb 
according to a biological marker improved the AON and olfactory bulb 
topographic correspondence. Thus, the topography between the 
olfactory epithelium and the olfactory bulb further extends to the 
AON. 


Dorsally biased olfactory bulb input to amygdala 

Mitral cell axons project to the anterior and posterolateral cortical 
amygdala. The organization of this axonal input is unknown. We 
injected AAV and rabies virus into small areas within these regions, and 
mapped starter cells onto a common schematic drawing based on 
anatomical landmarks (Fig. 3a). Trans-synaptically labelled mitral 
cells and glomeruli from amygdala starter cells were broadly distri- 
buted in the olfactory bulb. However, the labelled glomeruli were 
enriched in the dorsal olfactory bulb (Fig. 3c). For quantification, 
we compared the mean experimental θ′_OB for each injection with 
mean θ′_OB values produced by computer simulation from the same 
number of glomeruli distributed randomly throughout the olfactory 
bulb (^simθ′_OB). For the AON experiments, the mean experimental 
θ′_OB values for the majority of the samples were significantly larger 
or smaller than the corresponding mean ^simθ′_OB values (Fig. 3e, left), 
reflecting the dorsal-ventral topography between the olfactory bulb 
and the AON. By contrast, none of the mean θ′_OB values from the 
amygdala was significantly larger than the corresponding mean 
^simθ′_OB (Fig. 3e, middle). Six out of ten mean θ′_OB values from the 
cortical amygdala fell significantly below the corresponding mean 
^simθ′_OB values. For these dorsally biased samples, the density of 
labelled glomeruli gradually decreased along the dorsal-ventral axis 
without a sharp boundary (Supplementary Fig. 12). Simple spatial 
correspondence between starter-cell locations and the degree of dorsal 
bias was not evident (Supplementary Fig. 10b). In summary, the cortical 
amygdala overall receives biased input from the dorsal olfactory bulb. 
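
The comparison with simulated values can be sketched as a Monte Carlo procedure. The code below assumes that a list of θ′ values for all glomeruli in the standard olfactory bulb model is available as `pool` (a hypothetical input), draws the same number of glomeruli as were labelled in an experiment, and reports the central 95% interval of the simulated means. The number of simulations, the seed and the function names are illustrative choices rather than details of the original analysis.

```python
import random


def simulated_mean_interval(pool, n_labelled, n_sim=10000, seed=0):
    """Central 95% interval of the mean theta' of n_labelled glomeruli
    drawn at random, without replacement, from pool."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.sample(pool, n_labelled)) / n_labelled for _ in range(n_sim)
    )
    lo = means[n_sim * 25 // 1000]        # 2.5th percentile
    hi = means[n_sim * 975 // 1000 - 1]   # 97.5th percentile
    return lo, hi


def classify_bias(experimental_mean, interval):
    """Report whether an experimental mean lies below, within or above the interval."""
    lo, hi = interval
    if experimental_mean < lo:
        return "below"
    if experimental_mean > hi:
        return "above"
    return "within"
```

In the terms used in the text, a sample classified as "below" corresponds to a dorsally biased distribution of labelled glomeruli.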


Less organized olfactory bulb input to piriform cortex 


The piriform cortex is the largest cortical area in the olfactory cortex. 
Recent physiological analysis found that neurons activated by 
specific odours are apparently not spatially organized; the underlying 
anatomical basis is unclear. We injected AAV and rabies virus into 
several areas in the anterior and posterior piriform cortex, and 
mapped starter cells from different brains onto a common schematic 
drawing of the entire piriform cortex based on anatomical landmarks 
(Fig. 3b). Labelled glomeruli were broadly distributed throughout the 
olfactory bulb, regardless of starter-cell locations in the piriform cortex 
(Fig. 3d). In sharp contrast to trans-synaptic labelling from the AON or 
amygdala, where different samples showed highly variable mean θ′_OB, 
mean θ′_OB values from the piriform cortex tracings were much less 
variable and closely resembled a random distribution (Fig. 3e). Only 
one out of ten samples had a mean θ′_OB slightly above the 95th percentile 
of the mean ^simθ′_OB. Further, no strong spatial correspondence 
was evident in correlation analyses of θ′_OB, Z_OB and the location of 
starter cells in the piriform cortex (Supplementary Fig. 10c). These data 
indicate that highly restricted areas of the piriform cortex receive direct 
mitral cell input representing glomeruli that are distributed through- 
out the olfactory bulb with no apparent spatial organization. 


Convergence of mitral cell input 

Convergent inputs from different glomeruli to individual cortical 
neurons could allow the olfactory cortex to integrate combinatorial 
odour representations in the olfactory bulb. In support of this, pre- 
vious studies have shown that odour receptive ranges of AON cells are 
broader than those of mitral cells, and that some piriform cortex 
neurons are activated by a binary odour mix but not individual components. 
However, a large fraction of inputs in these studies could 
come from other cortical neurons through extensive recurrent connections 
(Figs 2a and 3a, b). Direct convergence of mitral cell axons 
onto individual cortical neurons is implied in physiological studies of 
piriform cortical neurons in slices. Our trans-synaptic labelling 
enabled a direct examination of mitral cell convergence to individual 
cortical neurons in vivo. 

The convergence index, defined by the number of labelled mitral 
cells divided by the number of the starter cells in the cortex, exceeded 1 
in all experiments using actin-CreER (Fig. 4a and Supplementary 
Table 1). This finding demonstrates that individual cortical neurons 
receive direct inputs from multiple mitral cells in vivo. As the vast 
majority of labelled mitral cells corresponded to different glomeruli 
(Supplementary Table 1), individual cortical neurons must receive 
direct inputs representing multiple glomeruli. This convergence index 
Figure 3 | Representations of olfactory bulb input in the amygdala and 
piriform cortex. a, b, Starter cells from the cortical amygdala and piriform 
cortex. Left, single coronal sections at the injection sites in the posterolateral 
cortical amygdala (a) and the posterior piriform cortex (b). Arrows point to 
starter cells magnified in insets. Scale bars, 100 µm in a, 200 µm in b. Right, 
schematic representations of ten independent injections each into amygdala 
(a) or piriform cortex (b). Starter cells from each injection are labelled with a 
specific colour. The dotted line denotes the rough border between the anterior 
cortical amygdala (ACo) and the posterolateral cortical amygdala (PLCo) based 
on anatomical landmarks according to a mouse brain atlas. APC, anterior 
piriform cortex; En, lateral entorhinal cortex; ME, medial amygdala; nLOT, 


is probably an underestimate, as not all starter cells necessarily 
received direct mitral cell input (overestimation of the denominator), 
and not all cells presynaptic to starter cells were trans-synaptically 
infected by the rabies virus (underestimation of the numerator; see 
Supplementary Fig. 3). 
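
As a worked example of the convergence index, the AON experiment described earlier (11 starter cells and 69 labelled mitral cells) gives an index of about 6.3. The function name below is hypothetical.

```python
def convergence_index(n_labelled_mitral, n_starter):
    """Number of labelled mitral cells divided by the number of starter cells."""
    if n_starter <= 0:
        raise ValueError("at least one starter cell is required")
    return n_labelled_mitral / n_starter


# Example from the text: 69 labelled mitral cells from 11 starter cells.
aon_example = convergence_index(69, 11)
```

Because the denominator may include starter cells without mitral cell input and the numerator may miss uninfected presynaptic cells, the value is best read as a lower bound.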

The convergence indices varied widely in different experiments, and 
did not differ substantially in the three cortical areas we examined. 
However, in experiments that contained starter cells located in layer I, 
which is mostly composed of GABAergic local interneurons*’, the 
convergence indices were greater (Fig. 4a, red). Assuming all starter 
cells in a given layer contribute equally to mitral cell labelling, multiple 
regression analyses indicate that layer I neurons receive direct input 
from more mitral cells than layer II/III neurons (Fig. 4b). 
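
One way to express the stated assumption is a linear model without an intercept, in which the number of labelled mitral cells in an experiment equals a × (layer I starter cells) + b × (layer II/III starter cells), with a and b the per-cell contributions. The sketch below fits this model by ordinary least squares; the model form, the fitting method and the function name are assumptions, since the details of the regression are not given here.

```python
def fit_layer_contributions(experiments):
    """Least-squares fit of mitral = a * n_layer1 + b * n_layer23 (no intercept).

    experiments: iterable of (n_layer1, n_layer23, n_mitral) tuples.
    Returns (a, b), the estimated number of labelled mitral cells per
    starter cell in layer I and in layers II/III.
    """
    rows = list(experiments)
    s11 = sum(a * a for a, _, _ in rows)
    s12 = sum(a * b for a, b, _ in rows)
    s22 = sum(b * b for _, b, _ in rows)
    t1 = sum(a * m for a, _, m in rows)
    t2 = sum(b * m for _, b, m in rows)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("layer counts are collinear; contributions not identifiable")
    # Solve the 2 x 2 normal equations by Cramer's rule.
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det
```

The fit is only informative when the experiments differ in their mix of layer I and layer II/III starter cells; otherwise the two contributions cannot be separated.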

To confirm the higher convergence index for layer I GABAergic 
neurons, we replaced the ubiquitous actin-CreER with GAD2-CreER, 
which is expressed only in GABAergic interneurons (Supplementary 
Fig. 13). We found that GABAergic neurons located in layer II or III of 
the piriform cortex received little direct mitral cell input, whereas 
those located in layer I showed a much greater convergence index 
(Fig. 4b, right, and Supplementary Table 1). Thus, cortical GABAergic 
neurons are highly diverse with respect to mitral cell innervation. These 
observations are in accordance with recent physiological studies, 
and suggest different physiological roles for these GABAergic neurons; 
layer I and deeper layer GABAergic neurons provide global feedforward 
and feedback inhibition to cortical pyramidal neurons, respectively. 


Sister mitral cells connect independently 

Each glomerulus is innervated by the apical dendrites of ~25 elec- 
trically coupled mitral cells*’. We refer to these cells as ‘sister’ mitral 
cells. Sister mitral cells may preferentially connect to the same cortical 
postsynaptic target neurons compared to ‘non-sister’ mitral cells that 
receive direct input from different glomeruli. Such an organization 
could increase the signal-to-noise ratio in information transmission 
nucleus of lateral olfactory tract; PMCo, posteromedial cortical amygdala; PPC, 
posterior piriform cortex. c, d, Superposition of three independent 3D 
reconstructions of glomerular maps with starter cells from the cortical 
amygdala (AMY; c) or the piriform cortex (PC; d). e, Mean θ′_OB values 
(crosses) from each experiment are plotted in the same column with the 95% 
confidence intervals for corresponding ^simθ′_OB values (grey bars). Samples with 
experimental mean θ′_OB outside the 95% confidence intervals are labelled with 
asterisks (*P < 0.05). Colours in a (scheme), c and e (amygdala) are matched to 
represent the same samples, and so are the colours in b (scheme), d and 
e (piriform cortex). 


from mitral cells to cortical neurons. Alternatively, sister mitral cells 
may connect to cortical neurons independently to deliver olfactory 
information widely across different cortical neurons. 

We used the frequency of dually labelled glomeruli from our data 
set and statistical simulation to distinguish between these possibilities. 
Dually labelled glomeruli (D) could result from a single starter cell 
(Ds) or two starter cells (Dt). Assuming that an individual starter cell 
can receive input from any of the 2,000 glomeruli, we compared the 
distribution of Ds derived from our data and from a simulation 
according to the null hypothesis that sister mitral cells connect inde- 
pendently with postsynaptic targets. If sister mitral cells share signifi- 
cantly more postsynaptic targets than at random, then the ‘data Ds’ 
distribution should be significantly higher than the simulated ‘random 
Ds’ distribution. In all but two cases, these two distributions were not 
statistically different (Fig. 4c). Both exceptions came from trans-synaptic 
labelling from the AON, which showed dorsal-ventral topography, 
so the assumption that a starter cell can receive input from any of the 
2,000 glomeruli did not hold. When we 
reduced the number of accessible glomeruli to 1,500, no sample 
showed significant differences. Thus, our analysis indicates that indi- 
vidual mitral cells innervating the same glomerulus act independently 
in making connections with their cortical targets. 


Discussion 


Our study revealed several general principles that define cortical repre- 
sentations of the olfactory bulb input. First, individual cortical neurons 
receive direct input from mitral cells originating from multiple glom- 
eruli. On average, each excitatory neuron receives direct input from 
four mitral cells, but this number is likely to be an underestimate. 
Convergence of mitral cell inputs enables cortical neurons to integrate 
information from discrete olfactory channels. The lower bound of four 
already affords ~10¹⁰ glomerular combinations for 1,000 olfactory 
channels, far exceeding the number of neurons in the mouse olfactory 
cortex. Thus, the olfactory cortical neuron repertoire samples only a 

Figure 4 | Convergence and independence of mitral cell inputs. 

a, Convergence index for each cortical injection experiment is represented by a 
diamond, with the type and layer of starter cells specified by the colour code 
above. AMY, amygdala; PC, piriform cortex. b, Multiple regression analysis to 
estimate the convergence indices of starter cells located in different layers of the 
AON and piriform cortex. Estimated mean convergence indices (red crosses) 
and the corresponding 95% confidence intervals (grey bars) are shown. Data 
from actin-CreER and GAD2-CreER were analysed separately. Injections into 
amygdala produced only one sample that contained layer I cells and were 
therefore excluded. c, Schematic of dually labelled glomeruli (D) resulting from 
two starter cells (Dt) or a single starter cell (Ds). Comparison of the 
distributions of Ds derived from experimentally observed frequency of D (Data 
Ds; red) and from simulated D based on the null hypothesis detailed in 
Methods (Random Ds; blue). For each sample, the distributions of ‘data Ds’ and 
‘random Ds’ are shown by coloured heat maps. *P < 0.05. 


small fraction of all possible combinations of direct olfactory bulb 
inputs. 
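
The size of this combinatorial space can be checked with a short calculation. The sketch below assumes that each cortical neuron draws its four inputs from distinct channels and that the order of the inputs does not matter; the variable names are illustrative only.

```python
from math import comb

channels = 1000          # olfactory channels, as stated in the text
inputs_per_neuron = 4    # lower bound on converging mitral cell inputs

# Unordered choices of 4 distinct channels out of 1,000
# (an assumed counting model, not necessarily the authors' exact one)
combinations = comb(channels, inputs_per_neuron)
print(f"{combinations:.2e}")
```

Under these assumptions the count is about 4 × 10¹⁰ combinations.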

Second, neurons restricted to small olfactory cortical regions receive 
input from glomeruli that are broadly distributed in the olfactory bulb. 
Although similar findings were reported previously'®'’, our study 
provides a higher resolution analysis of direct connectivity between 
mitral cells and cortical neurons, rather than inferring connection from 
the presence of axons, which could be a major caveat of previous 
tracing studies (see Supplementary Fig. 5). At the same time, mitral 
cells representing the same glomerulus connect independently to post- 
synaptic cortical neurons, thus maximizing the spread of olfactory 
information originating from individual olfactory channels. Our find- 
ing is consistent with analyses of axon arborization patterns of singly 
labelled mitral cells (S. Ghosh and colleagues; manuscript submitted). 

Third, different cortical areas receive differentially organized olfactory 
bulb input (Supplementary Fig. 1c). The AON maintains a coarse topo- 
graphy along the dorsal-ventral axis, suggesting a pre-processing role for 
olfactory-bulb-derived information before sending to other cortical 
areas. A lack of apparent spatial organization in the piriform cortex with 
regard to olfactory bulb input provides an anatomical basis for recent 
physiological studies***', and suggests that the piriform cortex acts as an 
association cortex’’”*. In the cortical amygdala, many neurons seem to 
receive strongly biased input from the dorsal olfactory bulb. Mice lacking 
olfactory receptor neurons that project to the dorsal olfactory bulb lose 
their innate avoidance of odours from predator urine and spoiled 
food, despite retaining the ability to sense these odours”’. The cortical 
amygdala may preferentially process the olfactory information that 
directs innate behaviours. Our study is in agreement with similar find- 
ings using axon tracings from individual glomeruli (D. L. Sosulski and 
colleagues; manuscript submitted). 

Interestingly, axonal arborization patterns of Drosophila olfactory 
projection neurons (equivalent to mitral cells) in higher olfactory 
centres show a similar organizational principle. Projection neuron axon 
arborization patterns in the lateral horn—a processing centre directing 
odour-mediated innate behaviour—are highly stereotyped with respect 
to projection neuron classes**”’, and are partitioned according to the 
biological significance of the odorants”. Arborization patterns of axon 
collaterals of the same projection neurons in the mushroom body, an 
olfactory memory centre’, are much less stereotyped**”’, consistent 
with a physiological study indicating non-stereotyped connections”. 
Therefore, from insects to mammals, a common theme emerges for 
the representations of olfactory information: more stereotyped and 
selective representation of odours is necessary for directing innate beha- 
viours, whereas broader and less stereotyped sampling of the whole 
olfactory space is better suited for brain regions implicated in associative 
memories. 

The genetically controlled mono-trans-synaptic tracing described 
here should be widely applicable for mapping neuronal circuitry 
throughout the mouse brain. It is currently unknown how rabies virus 
crosses synapses, and whether the efficiency and specificity vary with 
cell type, connection strength and activity***’. Further applications of 
these trans-synaptic methods to other neurons and circuits** will be 
necessary to address these questions. Nevertheless, the control experi- 
ments (Fig. 1 and Supplementary Figs 2-6) confirmed that our strategy 
labels neurons that are directly presynaptic to starter cells but not 
neurons whose axons pass through the injection sites without making 
synapses. Our method will be especially valuable for analysing long- 
distance connections that are usually refractory to physiological map- 
ping strategies’. This method can be further extended to genetic 
manipulation of starter cells to combine circuit tracing with genetic 
loss- or gain-of-function experiments. These approaches will facilitate 
the investigation of not only the organization of information flow 
within neural circuits, but also the molecular basis of neuronal con- 
nections at single-cell resolution in vivo. 


METHODS SUMMARY 


Detailed methods on the generation of CAG-stop-tTA2 mice, viral preparations, 
animal surgery, tissue processing, 3D reconstruction and quantitative analyses 
can be found in Methods. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 

Generation of CAG-stop-tTA2 Mice. The tTA2 transactivator gene” was placed 
after the CAG promoter of plasmid pCA-H7Z2 (ref. 50) using a polymerase chain 
reaction (PCR)-based cloning method. A neomycin resistance (neoʳ) gene and a 
transcriptional stop signal”' were flanked by loxP sites to create a loxP-neoʳ-stop-loxP 
cassette. This cassette was then introduced between the CAG promoter and 
tTA2 using PCR-based cloning. An EcoRI fragment obtained from ETLpA-/LTNL", 
which contains the IRES-tau-lacZ cassette, was introduced after the 
tTA2 coding sequence. The resulting cassette (CAG-stop-tTA2-IRES-tau-lacZ) 
was cloned into pBT264 to flank the cassette with two copies of a ~250-bp 
β-globin HS4 insulator sequence” on each side. pBT264 (pii-TRE-tdTomato-3Myc-ii) 
was generated by inserting PCR-amplified copies of ~250-bp-long core 
insulator fragments (i) from the chicken β-globin HS4 insulator on each side of 
TRE-tdTomato-3Myc in pBT239. The insulator fragments were amplified from 
pJC13-1 (ref. 53). The final construct, pKM1 (pii-CAG-stop-tTA2-IRES-tau-lacZ-ii), 
was tested by transient co-transfection with pBT264 into cultured 
HEK293 cells. When a Cre-encoding plasmid pBT140 (cytomegalovirus 
(CMV) promoter driving nuclear localization signal-Cre) was further introduced 
into the same cell, strong tdTomato fluorescence was detected 72 h after transfection. 
pKM2 was digested with restriction enzymes SwaI and AscI, the insert was 
gel-purified using Qiagen gel extraction kit and eluted into 10 mM Tris-HCl, pH 
7.4, 0.1 mM EDTA. The purified and linearized DNA devoid of plasmid backbone 
was used for mouse transgenesis via standard pronuclear injection procedure. 
Founders were screened by PCR primers to detect the neoʳ gene. Four independent 
transgenic lines were established. They were crossed with mice containing 
β-actin-CreER’ and TRE-Bi-SG-T reporter” to screen for functional CAG-stop-tTA2 
transgenes. Mice containing all three transgenes were injected with 1 mg of 
tamoxifen in corn oil at postnatal day (PD)10, and brains were collected at PD21 
for the analysis. Two lines showed broad tdTomato fluorescence throughout the 
brain. One line (containing 2-3 copies of the transgene based on Southern blot- 
ting) was used exclusively in this study. 

Virus preparations. All viral procedures followed the Biosafety Guidelines 
approved by the Stanford University Administrative Panel on Laboratory 
Animal Care (A-PLAC) and Administrative Panel of Biosafety (APB). To make 
the AAV containing the TRE-HTG cassette, which encodes histone-GFP, TVA 
and G linked by the 2A ‘self-cleaving’ peptides, the HTG cassette obtained from 
pBOB-synP-HTB (I.W. and E.M.C., unpublished plasmid) was placed after the 
TRE-Tight promoter in pTRE-Tight (Clontech), and then the entire construct was 
subcloned into the pAAV vector (Stratagene). Recombinant AAV serotype 2 was 
produced using the pAAV helper free kit (Stratagene) according to the manu- 
facturer’s instructions. AAV was also produced commercially by the Gene 
Therapy Center of the University of North Carolina. The AAV titre was estimated 
to be ~4 × 10¹² viral particles ml⁻¹ based on serial dilution and blot hybridization 
analysis. Pseudotyped ΔG rabies virus was prepared as previously described***. The 
pseudotyped rabies virus titre was estimated to be ~5 × 10° infectious particles per 
ml based on the infections of cell line 293-TVA800 by serially diluted virus stocks. 
Animal surgery. All animal procedures followed animal care guidelines 
approved by A-PLAC. To activate Cre in animals carrying a CreER transgene, 
we injected intraperitoneally 0.1-1 mg of tamoxifen (Sigma) dissolved in corn oil 
into mice around PD10. For trans-synaptic labelling, 0.1-0.3 µl of AAV-TRE-HTG 
was injected into brain at PD21 by using a stereotactic apparatus (KOPF). 
During surgery, animals were anaesthetized with 65 mg kg⁻¹ ketamine and 13 mg 
kg⁻¹ xylazine (Ben Venue Laboratories). For motor cortex injections, the needle 
was placed 1.5 mm anterior and 1.5 mm lateral from the Bregma, and 0.4 mm 
from the brain surface. For olfactory cortex injections, see Supplementary Fig. 1b 
for the stereotactic parameters. After recovery, animals were housed in regular 
12 h dark/light cycles with food and water ad libitum. Two weeks later, 0.3 µl of 
pseudotyped rabies virus (ΔG-mCherry+EnvA) was injected into the same brain 
location under anaesthesia. After recovery, animals were housed in a biosafety 
room for 7 days to allow rabies virus to infect, trans-synaptically spread and 
express sufficient amount of mCherry to label presynaptic cells. All animals were 
healthy and their brain structures were normal 7 days after rabies virus infection, 
confirming non-pathogenicity of ΔG mutant rabies virus. 

Tissue processing. Brain tissue was processed according to previously described 
procedures**. To set the common coronal plane among different animals, the 
cerebellum was cut off and the brain was embedded in the Optimum Cutting 
Temperature (OCT) compound (Tissue-Tek) with the cut surface facing the 
bottom of the mould. The brain was adjusted to ensure that the left-right axis 
was parallel to the section plane. Neither mCherry nor histone-GFP required 
immunostaining for visualization. In some cases, brain sections were immuno- 
stained for better signal preservation according to previously published meth- 
ods”® using the following antibodies: chicken anti-GFP (1:500; Aves Labs), rabbit 
anti-DsRed (1:1,000; Clontech), donkey anti-chicken fluorescein isothiocyanate 
(FITC) and donkey anti-rabbit Cy3 (1:200; Jackson ImmunoResearch). In most 
trans-synaptic labelling experiments starting from the olfactory cortex, one 
of every four sections of the olfactory bulb was immunostained by the free-floating 
method with goat anti-OCAM (1:100; R&D Systems) and donkey anti-goat 
Alexa488 (Invitrogen) to label OCAM⁺ olfactory receptor neuron axons. For 
immunostaining against GABA, 60-µm free-floating coronal sections were 
treated with rabbit anti-GABA (1:2,000 in PBS with 0.3% Triton-X100; Sigma) 
for 48 h. GABA⁺ cells were visualized with donkey anti-rabbit Cy3 (1:200; 
Jackson ImmunoResearch). Sections were imaged with a Nikon CCD camera 
by using a 10× objective or by 1-µm optical sectioning using confocal microscopy 
(Zeiss 510). 

3D reconstruction. To compare distribution of labelled glomeruli (olfactory 
bulb) and starter cells (AON) across different samples, we needed to map them 
in a common 3D reference frame. To do this, we first saved manual annotations 
carried out in Adobe Illustrator in a scalable vector graphics (SVG) format. The 
SVG file saved all the annotations as an extensible markup language (XML) file 
describing the ellipses and contours (defined later), making it feasible to accurately 
parse the information by MATLAB scripts. In the olfactory bulb, we represented all 
glomeruli as ellipses. We used the centre of mass for each ellipse to define a single 
point, and calculated the centre of mass of all the points to define the centre of each 
slice. For the AON, we defined the contour as the boundary between layer I and 
layer II, which can be clearly distinguished by differences in the density of 4’,6- 
diamidino-2-phenylindole (DAPI) staining. To define the centre of mass for each 
contour, we replaced it with a dense series of points and used these points to 
calculate the centre of mass. Now, each slice is represented by a series of points 
and the centre of mass contained within an SVG file. To assemble the slices 
represented by SVG files into a 3D shape, we first aligned the centre of mass for 
each slice to that of the previous slice to form the cylindrical (z-)axis. Then, we 
refined the alignment by sequentially applying the iterative closest points (ICP) 
algorithm”®, which can identify the local rotation and translation parameters for 
each slice to maximize the overlap with the previous slice. Once we had aligned all 
the slices in a sample to generate a 3D shape, we needed to identify an orientation 
for the polar axis that could be most reliably identified in different 3D reconstructions. 
As the olfactory bulb is ellipsoidal, principal component analysis (PCA) 
can reliably find a plane that contains the z-axis and intersects the 3D shape to 
maximize the surface of the intersection (plane m). We then defined the polar axis 
to be contained within the plane m, perpendicular to the z-axis, and pointing in the 
dorsal direction. For the AON, we approximated the contours of the most posterior 
slice of the AON as a triangle and calculated the rotation around the z-axis that 
minimizes the distance of the three vertices to those of a standard AON sample. We 
applied the same rotation to the whole 3D shape. To define the orientation of the 
polar axis, we used the side of the triangle that connects two of its medial vertices 
and points in the dorsal direction. Then we defined the polar axis as the line that is 
parallel to it and that intersects the z-axis. Finally, we calculated the volume occu- 
pied by each shape and applied a uniform scaling factor to account for different sizes 
of the anatomical structures in different animals. 
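
The first alignment step described above, stacking slices so that their centres of mass share a common z-axis, can be sketched as follows. This is a simplified illustration in Python rather than the MATLAB implementation used in the study; the function names and the `thickness` parameter are hypothetical, and the ICP refinement, PCA-based orientation and volume scaling are omitted.

```python
def centroid(points):
    """Centre of mass of a list of (x, y) points with equal weights."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def align_slices(slices, thickness=1.0):
    """Translate each 2D slice so its centroid lies on the z-axis, then stack.

    slices: list of slices, each a list of (x, y) points.
    thickness: assumed spacing between consecutive slices (hypothetical units).
    Returns a flat list of (x, y, z) points.
    """
    cloud = []
    for i, pts in enumerate(slices):
        cx, cy = centroid(pts)
        cloud.extend((x - cx, y - cy, i * thickness) for x, y in pts)
    return cloud

# Two made-up square slices drawn at different positions on the page
example = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(10, 10), (12, 10), (12, 12), (10, 12)],
]
aligned = align_slices(example)
```

After this step every slice is centred on the same axis, so a rotation-fitting method such as ICP only has to resolve the remaining in-plane rotation and small residual shifts.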

All the steps explained earlier were implemented in MATLAB, which ran 
automatically without human intervention to avoid biasing the registration 
results. Once we had registered each shape, we used a standard algorithm to 
extract surfaces from two-dimensional (2D) contours” to transform the point 
cloud into a triangulated mesh that could be saved in the visualization toolkit 
(VTK)°** format for visualization and analysis purposes. 

We used the following landmarks to map starter cells in the amygdala and 
piriform cortex (Fig. 3): appearance of the olfactory tubercle (Fig. 3, I); end of the 
olfactory tubercle (Fig. 3, II); appearance of the hippocampus (Fig. 3, III); and 
appearance of the dentate gyrus of the hippocampus, on the ventral edge of the 
cortex (Fig. 3, IV). 
Quantitative analyses. For each tracing experiment where we analysed the distribution 
of labelled glomeruli along the dorsal-ventral axis using mean θ′OB 
(Fig. 3e), we generated a corresponding random distribution of simulated mean 
θ′OB (mean simθ′OB) from M glomeruli, where M is the number of labelled 
glomeruli in the injection. To generate this random distribution for each experiment, 
we randomly selected M glomeruli from a given 3D reconstruction model 
(generated from that injection) and calculated the mean simθ′OB value for those 
randomly selected M glomeruli to get the mean simθ′1. We then repeated the same 
simulation 50,000 times to obtain mean simθ′2, ..., mean simθ′50,000, and therefore 
to obtain the range of mean simθ′OB for M glomeruli that are randomly distributed 
throughout the olfactory bulb. Once we obtained distributions for mean simθ′OB 
that corresponded to each injection, we compared the mean simθ′OB distribution 
with the experimental mean θ′OB. If the value for the experimental mean θ′OB was 
outside the 95% range of the mean simθ′OB distribution, we considered the glomerular 
distribution to be non-random for that sample. 
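
The resampling procedure described above can be sketched in a few lines. This is an illustrative Python version, not the original analysis code: the function names are hypothetical, the angles are assumed to be available as a plain list of numbers, and the 95% range is assumed to be the central (two-sided) interval of the simulated means.

```python
import random

def simulated_means(all_theta, m, n_sim=50_000, seed=0):
    """Mean angle of m glomeruli drawn at random (without replacement), repeated n_sim times."""
    rng = random.Random(seed)
    return [sum(rng.sample(all_theta, m)) / m for _ in range(n_sim)]

def is_nonrandom(observed_mean, sim_means, level=0.95):
    """True if the observed mean falls outside the central `level` range of the simulated means."""
    s = sorted(sim_means)
    tail = (1 - level) / 2
    lower = s[int(tail * len(s))]
    upper = s[int((1 - tail) * len(s)) - 1]
    return not (lower <= observed_mean <= upper)
```

A typical call would pass the angles of all glomeruli in one reconstruction as `all_theta`, the number of labelled glomeruli as `m`, and the experimental mean as `observed_mean`.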
Multiple regression analysis (Fig. 4b) was conducted by using Excel (Microsoft). 
Data from every experiment in Supplementary Table 1 (n = 8 for the AON, n = 10 
for the piriform cortex using actin-CreER) were used for the left part of Fig. 4b. 
Data from seven experiments obtained from GAD2-CreER in the anterior piriform 
cortex were used in Fig. 4b, right. The number of labelled mitral cells in the 
olfactory bulb was set as a dependent variable, Y, and the number of starter cells 
in layer k (k = I, II, III) was set as an independent variable, Xk. The constant was set 
to zero. Excel then calculated the values of coefficients Ak (shown by red crosses in 
Fig. 4b) and 95% confidence intervals of Ak based on the Student’s t-test (shown by 
grey bars in Fig. 4b). R values for these multiple regression assays were: 0.98 for the 
AON; 0.96 for the piriform cortex (actin-CreER); and 0.97 for the piriform cortex 
(GAD2-CreER) data sets. 
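
The regression model above has the form Y = A_I·X_I + A_II·X_II + A_III·X_III with the constant fixed at zero. The following Python sketch shows how such coefficients can be obtained by least squares; it stands in for the Excel calculation, the function name is hypothetical, the example counts are made up, and the confidence intervals are not computed.

```python
def fit_through_origin(X, y):
    """Least-squares coefficients for y = sum_k A_k * X_k with no intercept.

    X: list of rows, one per experiment, each holding starter cell counts per layer.
    y: list of labelled mitral cell counts, one per experiment.
    """
    p = len(X[0])
    # Normal equations (X^T X) A = X^T y, stored as an augmented matrix
    M = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        pivot_value = M[c][c]
        M[c] = [v / pivot_value for v in M[c]]
        for r in range(p):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][p] for i in range(p)]

# Made-up data: starter cells in layers I, II, III and labelled mitral cells
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1], [2, 1, 0]]
y = [2, 4, 1, 7, 8]
coefficients = fit_through_origin(X, y)
```

Each coefficient can be read as the estimated number of labelled mitral cells per starter cell in the corresponding layer, which is how the convergence indices in Fig. 4b are described.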

To estimate the number of dually labelled glomeruli originating from single 
starter cells (data Ds in Fig. 4c) in our experimental data, we first simulated a 
hypothetical number of dually labelled glomeruli originating from single starter 
cells (Ds) and two independent starter cells (Dt) according to the null hypothesis 
that mitral cells connect randomly with postsynaptic targets. This situation can be 
modelled by ‘balls and bins’: there are 2,000 bins (a bin represents a single 
glomerulus) and N balls (a ball represents a single trans-synaptic labelling event). 
N balls were randomly thrown into 2,000 bins, and the number of bins that 
received more than one ball (that is, glomeruli labelled more than once) was 
counted. To distinguish Ds from Dt, we further introduced n different colours 
to the balls, where each colour represented an individual starter cell in the cortex. 
We assumed that an equal number of balls (N/n) were labelled with n different 
colours. Each ball was randomly thrown into one of 2,000 bins, and the number of 
bins containing more than one ball was counted. We separately counted the bins 
with more than one ball of an identical colour (representing Ds) and the bins with 
more than one ball of different colours (representing Dt). We fixed the number of 
bins (glomeruli) to be 2,000, while N and n corresponded to the number of 
labelled mitral cells and the number of starter cells, respectively, in each experiment. 
We repeated this simulation 100,000 times for each set of N and n to 
obtain the simulated distribution of Ds and Dt (we call these ‘random Ds’ and 
‘random Dt’). To estimate the Ds components in experimental data (data Ds), we 
assumed that individual starter cells contributed independently to the labelling 
(random Dt = data Dt). On the basis of the equation: D (number of observed 
dually labelled glomeruli) = Ds + Dt, we estimated the data Ds distribution by 
subtracting the random Dt from observed D (Fig. 4c). Then we determined if 
there was a significant difference in the distribution of data Ds and random Ds. 
We considered two distributions to be significantly different if the probability of 
data Ds > random Ds or data Ds < random Ds exceeded 0.95 (shown by asterisks 
in Fig. 4c). To accurately count dually labelled glomeruli, samples with more than 
200 labelled mitral cells were excluded from this analysis. 
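
The ‘balls and bins’ null model can be sketched as follows. This is an illustrative Python version under stated assumptions, not the original simulation code: the function names are hypothetical, balls are assigned to colours in round-robin order to approximate N/n per colour, and a bin holding more than one ball is counted as Ds only if all its balls share one colour and as Dt otherwise.

```python
import random

def simulate_once(n_balls, n_colours, n_bins=2000, rng=random):
    """Throw n_balls into n_bins once and return the counts (Ds, Dt)."""
    bins = {}
    for ball in range(n_balls):
        colour = ball % n_colours              # about N/n balls per colour
        bins.setdefault(rng.randrange(n_bins), []).append(colour)
    ds = dt = 0
    for colours in bins.values():
        if len(colours) > 1:
            if len(set(colours)) == 1:
                ds += 1    # all balls in this bin come from one starter cell
            else:
                dt += 1    # balls from at least two starter cells
    return ds, dt

def simulate(n_balls, n_colours, repeats=1000, seed=0):
    """Repeat the experiment to build the 'random Ds' and 'random Dt' distributions."""
    rng = random.Random(seed)
    return [simulate_once(n_balls, n_colours, rng=rng) for _ in range(repeats)]
```

Setting `repeats=100_000` would match the number of repetitions given in the text; the smaller default only keeps the sketch quick to run.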
Drosophila melanogaster is one of the most well studied genetic model organisms; nonetheless, its genome still contains 
unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are 
pre-requisites for understanding how the regulation of transcription, splicing and RNA editing directs the development 
of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome 
in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and 
non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded 
discovery using established experimental, prediction and conservation-based approaches. These data substantially 
expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of 
transcriptome dynamics throughout development. 


Drosophila melanogaster is an important non-mammalian model sys- 
tem that has had a critical role in basic biological discoveries, such as 
identifying chromosomes as the carriers of genetic information’ and 
uncovering the role of genes in development”’. Because it shares a 
substantial genic content with humans*, Drosophila is increasingly 
used as a translational model for human development, homeostasis 
and disease’. 

High-quality maps are needed for all functional genomic elements. 
Previous studies demonstrated that a rich collection of genes is 
deployed during the life cycle of the fly**. Although expression profiling 
using microarrays has revealed the expression of ~13,000 annotated 
genes, it is difficult to map splice junctions and individual base 
modifications generated by RNA editing? using such approaches. 
Single-base resolution is essential to define precisely the elements that 
comprise the Drosophila transcriptome. 

Estimates of the number of transcript isoforms are less accurate than 
estimates of the number of genes. Whereas ~20% of Drosophila genes 
are annotated as encoding alternatively spliced pre-mRNAs, splice-junction 
microarray experiments indicate that this number is at least 
40% (ref. 7). Determining the diversity of mRNAs generated by 
alternative promoters, alternative splicing and RNA editing will substantially 
increase the inferred protein repertoire. Non-coding RNA 
genes (ncRNAs) including short interfering RNAs (siRNAs) and 
microRNAs (miRNAs) (reviewed in ref. 10), and longer ncRNAs 
such as bxd (ref. 11) and roX (ref. 12), have important roles in gene 
regulation, whereas others such as small nucleolar RNAs (snoRNAs) 
and small nuclear RNAs (snRNAs) are important components of 
macromolecular machines such as the ribosome and spliceosome. 
The transcription and processing of these ncRNAs must also be fully 
documented and mapped. 

As part of the modENCODE project to annotate the functional ele- 
ments of the D. melanogaster and Caenorhabditis elegans genomes'*"*, 
we used RNA-Seq and tiling microarrays to sample the Drosophila 
transcriptome at unprecedented depth throughout development from 
early embryo to ageing male and female adults. We report on a high- 
resolution view of the discovery, structure and dynamic expression of 
the D. melanogaster transcriptome. 


Strategy for characterization of the transcriptome 


To discover new transcribed features (Supplementary Table 1) and 
comprehensively characterize their expression dynamics throughout 
development, we conducted complementary tiling microarray and 
RNA-Seq experiments using RNA isolated from 30 whole-animal 
samples representing 27 distinct stages of development (Supplemen- 
tary Table 2). These included 12 embryonic samples collected at 2-h 
intervals for 24 h, six larval, six pupal and three sexed adult stages at 1, 5 

Figure 1 | Discovery of new RNAs in the Bithorax complex. Genomic 
organization and experimental evidence for new transcripts located between 
the HOX genes, abd-A and Abd-B, based on short poly(A)⁺ RNA and total 
RNA-Seq expression profiles. The numbers to the left of each track indicate the 


and 30 days after eclosion. We used 38-base-pair (bp) resolution 
genome tiling microarrays to analyse total RNA from all 30 biological 
samples and poly(A)⁺ mRNA from the 12 embryonic samples (Supplementary 
Fig. 1). To attain single-nucleotide resolution and to facilitate 
the analysis of alternative splicing and RNA editing, we performed 
non-strand-specific poly(A)⁺ RNA-Seq from all 30 samples, generating 
a combination of single and paired-end ~75-bp reads on the 
Illumina Genome Analyser IIx platform (short poly(A)⁺ RNA-Seq) 
(Supplementary Table 3 and Supplementary Fig. 2). To identify 
primary transcripts and non-coding RNAs, the 12 embryonic time 
points were also interrogated with strand-specific 50-bp sequence reads 
from partially rRNA-depleted total RNA on the Applied Biosystems 
SOLiD platform (Supplementary Table 4 and Supplementary Fig. 3). To 
improve connectivity, mixed-stage embryos, adult males and adult 
females were used to generate ~250-bp reads on the Roche 454 platform 
(non-strand-specific long poly(A)⁺ RNA-Seq) (Supplementary 
Table 5). In total, we generated 176,962,906,041 bp of mapped sequence 
representing 1,266-fold coverage of the genome and 5,902-fold coverage 
of the annotated D. melanogaster transcriptome. 
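
The fold-coverage figures above imply approximate target sizes, which can be recovered with simple arithmetic. The sketch below assumes that fold coverage is defined as mapped bases divided by target length; the variable names are illustrative.

```python
mapped_bp = 176_962_906_041      # total mapped sequence, from the text
genome_fold = 1_266              # reported fold coverage of the genome
transcriptome_fold = 5_902       # reported fold coverage of the annotated transcriptome

# Implied target sizes in base pairs, assuming coverage = mapped bases / target length
implied_genome_bp = mapped_bp / genome_fold
implied_transcriptome_bp = mapped_bp / transcriptome_fold

print(f"genome: {implied_genome_bp / 1e6:.1f} Mb")
print(f"transcriptome: {implied_transcriptome_bp / 1e6:.1f} Mb")
```

Under this assumption the implied genome size is roughly 140 Mb and the implied annotated transcriptome roughly 30 Mb.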


Discovery of new transcribed regions 


We identified 1,938 new transcribed regions (NTRs) not linked to any 
annotated gene models. Herein, ‘transcripts’ refer to RNA molecules 
synthesized from a genomic locus whereas ‘genes’ refer to one or more 
transcripts that share exons in their mature spliced form. modENCODE 
cDNAs fully support 13% of the NTRs (Supplementary Fig. 4) and 
partially support 23%. Most NTRs (84%) are detected by poly(A)⁺ 
RNA-Seq, 44% by total RNA-Seq and 42% by tiling array. 
Approximately half of the NTRs are conserved in the distantly related 
Drosophila pseudoobscura and Drosophila mojavensis (Supplementary 
Fig. 4b) and 30% of these are detected by poly(A)⁺ RNA-Seq data from 
D. pseudoobscura or D. mojavensis adult heads (Supplementary Fig. 4c, 
d, Supplementary Table 6 and Supplementary Methods). The NTRs 
probably eluded previous detection because they are expressed at low 
levels, in temporally restricted patterns, and are enriched for single-exon 
genes. The new multi-exon gene models (48%) have fewer, shorter and 
less conserved exons than annotated genes. 

Nearly one-third of the NTRs have a predicted open reading frame 
(ORF) greater than 100 amino acids. The remaining NTRs could 
encode small peptides but many are likely to be non-coding RNAs. 
A small fraction (9%) of NTRs are heterochromatic; most of these 
(232) have sequence similarity (greater than 100-nucleotide match 
and greater than 60% identity) to transposable elements (TEs) and 

maximal number of reads for that sample. Three manually curated junction-based 
transcript models are shown; the green transcript model was fully 
validated by a cDNA, MIP06894. 


represent transcribed TEs or TE fragments. It remains to be deter- 
mined whether these regions have any function, although recent studies 
describe TE-associated regions that have acquired functions'*””. 

Even in the well-studied Bithorax complex’ we found an NTR. 
Known genetic breakpoints in the infra-abdominal regions iab-3 to 
iab-8, which lie between the homeotic genes abdominal A (abd-A) and 
Abdominal B (Abd-B), disrupt normal male development and affect 
fertility'*’°. Within this region are regulatory elements” and evidence 
for long non-coding RNAs that have eluded detection for over 
20 years*’**. We used the RNA-Seq data to infer the structures of at 
least three overlapping transcripts and validated one form (Fig. 1). 
The RNAs are expressed in embryos and adult males but not females. 
On the basis of the presumed role of this new gene and spatial expres- 
sion in the embryonic gonad (data not shown), we have named it male 
specific abdominal (msa). The cDNA contains short ORFs that are 
conserved in the melanogaster subgroup and could encode male- 
specific peptides. Whether they function as regulatory and/or as 
peptide-encoding RNAs is an important question for understanding 
development and segmental morphological diversity. 


Discovery of small ncRNAs 


We identified 37 unannotated intron-encoded and two unannotated 
intergenic small ncRNAs (<300 nucleotides) with an average frag- 
ments per kilobase of transcript per million fragments mapped 
(FPKM)* >20 from total embryonic RNA-Seq (Fig. 2 and Sup- 
plementary Table 7). Most of these ncRNAs are highly conserved in 
Drosophila sibling species*. We found published but unannotated 
ncRNAs: a U4atac snRNA” and four small Cajal-body-specific RNAs 
(scaRNAs)””. Of the remaining 34 ncRNAs, three are box C/D-like 
snoRNAs, 28 are box H/ACA-like snoRNAs, one is a scaRNA-like 
RNA, and two are unclassified. One-third of these are located in the 
introns of genes encoding RNA-binding proteins, the majority of which 
are involved in pre-mRNA splicing (x16, SC35, tra2, Dek, Prp8, Tudor- 
SN, and pUf68). 
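
The FPKM threshold used above normalizes fragment counts by transcript length and by sequencing depth. A minimal Python sketch of the basic formula follows; the function name and the example counts are hypothetical, and published pipelines typically apply further corrections that are not shown here.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments mapped."""
    kilobases = transcript_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions_mapped)

# Hypothetical example: 200 fragments on a 2,000-nucleotide transcript
# in a library of 10 million mapped fragments
example_value = fpkm(200, 2_000, 10_000_000)
```

Because the length term is in the denominator, a short RNA of a few hundred nucleotides needs relatively few fragments to reach a given FPKM value.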


Discovery of microRNA primary transcripts 

MicroRNAs are processed from primary microRNA transcripts (pri- 
miRNAs) and are either independently transcribed or embedded in the 
introns of protein-coding genes. We identified 23 putative indepen- 
dently transcribed pri-miRNAs from the total embryonic RNA-Seq 
and tiling array data that encode 37 annotated miRNAs (Supplemen- 
tary Table 8). Only two primary transcripts were previously annotated 
(bft and iab-4). The pri-miRNAs range from 1 to 18 kb and terminate 
Figure 2 | Discovery of small non-coding RNAs. a, Poly(A)+ (yellow) and
total RNA (blue) data from 10-12-h embryos are shown for the gp210 gene,
which hosts a representative new snoRNA. The maximal numbers of reads in the
poly(A)+ and total RNA-Seq data are shown on the left and right of the track,
respectively. b, The predicted RNA secondary structure of snoRNAgp219 is
characteristic of a H/ACA-box snoRNA. Nucleotides that are 100% conserved
in sequence or base-pairing are indicated in green and blue, respectively.
c, Embryonic expression of the new small RNAs. The scale bar indicates FPKM
Z-scores. unsRNA, unclassified small RNA.


at the mature miRNA (pre-mir-315, Supplementary Fig. 5a). Twelve of 
the 23 precursors have cap analysis of gene expression (CAGE) peaks 
that map at their initiation sites**. pri-miRNA expression is dynamic in 
embryonic development (Supplementary Fig. 5b). 


Overview of the Drosophila transcriptome 


We calculated expression levels of annotated genes, transcripts and 
NTRs (Supplementary Table 9) in the short poly(A)+ RNA-Seq and
tiling array data sets. From the RNA-Seq data we detected expression 
of 14,862 genes (Supplementary Fig. 7a) and 36,274 transcripts 
(Fig. 3a) with an FPKM >1 (Supplementary Tables 9-18) of which 
67% of genes and 58% of transcripts were also observed in the array 
data (score >300) (Supplementary Fig. 6 and Supplementary Tables 
19 and 20). This includes the confirmation of 87% of annotated genes 
and transcripts and the discovery of 17,745 new transcripts. In addi- 
tion, from the total RNA-Seq data we detected expression of 12,854 
genes and 32,139 transcripts with an FPKM >1 (Supplementary 
Tables 12, 13, 21 and 22) of which 77% of genes and 89% of transcripts
were also observed in the array data. Of the genes and transcripts 
observed exclusively in the total RNA-Seq data, 519 genes and 
1,005 transcripts (primarily noncoding) were previously annotated 
and 122 genes and 1,422 transcripts are new discoveries. The genes 
and transcripts not detected in any data set include small genes 
(<200 bp), members of multi-copy gene families such as ribosomal 
RNAs, paralogues (expected owing to our mapping parameters), 
genes known to be expressed at low levels or in small numbers of cells 
(for example, gustatory and odorant receptor genes), and non- 
polyadenylated transcripts. 


Expression dynamics 

We examined the dynamics of gene expression throughout development 
using the short poly(A)+ RNA-Seq data. The numbers of expressed
genes (FPKM >1) (Supplementary Fig. 7a) and transcripts (Fig. 3a)
gradually increase, from 7,045 (0-2 h embryos) to 12,000 (adult males).
Adult males express ~3,000 more genes than adult females, consistent 
Figure 3 | Dynamics of gene expression. a, Transcripts expressed (FPKM
>1) in the short poly(A)+ RNA-Seq data: FlyBase 5.12, blue; modENCODE,
purple. The bar graphs indicate the number of transcripts expressed in each
sample (Supplementary Table 1); the lines indicate the cumulative number of
expressed transcripts. The lighter blue and purple lines indicate the cumulative
number of transcripts expressed in the embryonic total RNA-Seq samples. The
horizontal dotted lines indicate the number of expressed previously annotated
transcripts. F, female; M, male. b, Scatter plot of sex-biased gene expression.
Light red, female-biased annotated (n = 960); dark red, female-biased NTRs
(n = 12); light blue, male-biased annotated (n = 2,401); dark blue, male-biased
NTRs (n = 431); light grey, unbiased annotated (n = 8,217); black, unbiased
NTRs (n = 136). c, Genome coverage. For each developmental sample, the
short poly(A)+ reads were used to estimate the percentage of the genome
covered using a cutoff of two reads. The mature and primary transcripts were
inferred for the previous FlyBase 5.12 (dotted lines) and modENCODE (solid
lines) gene models.


with the known transcriptional complexity of the testis’. We observed 
that 40% of expressed genes are constitutively expressed in 30 samples 
(Supplementary Fig. 7b). We also observed developmentally regulated 
expression of TEs (Supplementary Materials and Supplementary Fig. 8). 

We observed pronounced expression changes in over 1,500 genes 
in the first two third instar larval samples (Supplementary Fig. 7a, c). 
Expression of 1,199 genes increased at least tenfold, and 421 genes 
decreased at least tenfold (Supplementary Table 23). Nearly all of the 
upregulated genes are expressed for the first time during the third 
instar stage and most are poorly characterized genes. 

The earliest known event in metamorphosis is the ‘mid-3rd transi- 
tion’, identified by the synchronous changes in the transcription of a 
number of well studied genes, Ecdysone-induced protein 28/29kD and 
Fat body protein 1 (reviewed in ref. 31), and the switch from proximal 
to distal promoters of Alcohol dehydrogenase”. These markers coincide 
with the surge reported here. The mid-3rd transition has no morpho- 
logical or behavioural correlates and is associated with a pulse of the 
steroid hormone ecdysone** acting through a non-standard receptor™. 
Whether the onset of testis development is a consequence of the mid- 
3rd transition, or whether the two events are functionally related, 
remains to be investigated. 

Over 29% of protein-coding genes showed significant sex-biased 
expression in adults (false discovery rate <0.1%), with more male- 
biased (1,829) or male-specific genes (572) than female-biased (945) 
or female-specific genes (15) (Supplementary Tables 24 and 25, and 
Fig. 3b). Known female (ovo and otu) and male (dj) sex-biased genes 
were expressed as expected. We found that 74% of the NTRs expressed 
in adults were significantly male-biased whereas only 2.1% were sig- 
nificantly female-biased. 
Table 1 | Classification of alternative splicing events

Splicing event                 FlyBase r5.12   modENCODE         New events        Short poly(A)+ RNA-Seq   Significantly changing
Cassette exons                 793             2,717             2,014             2,369                    1,539
Alternative 5' splice sites    843             5,192             4,599             4,583                    3,142
Alternative 3' splice sites    879             6,253             5,505             5,579                    3,242
Mutually exclusive exons       229             251               123               228                      226
Coordinate cassette exons      301             1,227             979               992                      467
Alternative first exons        1,767           4,936             3,442             4,473                    3,996
Alternative last exons         227             604               432               553                      471
Retained/unprocessed introns   1,434           2,679 (5,667)     1,275 (4,263)     2,439 (35,641)           868 (8,998)
Total                          6,437           23,859 (26,847)   18,369 (21,478)   21,216 (54,418)          13,951 (22,081)

The number of retained/unprocessed introns in parentheses indicates the total number identified, whereas the number not in parentheses indicates the subset of identified events that have been validated by cDNA sequences or FlyBase 5.12 annotations.


Genome coverage 


Mature mRNAs are encoded by 20% of the D. melanogaster genome 
and primary transcripts by 60% (Fig. 3c). An additional 15% of the 
genome (~75% total) is detected when considering all of the short 
poly(A)+ RNA-Seq data. However, as greater than 99% of the reads
map within the bounds of the transcript models, the reads that map to 
intergenic regions constitute a small minority of our data. Thus, 
although pervasive transcription of mammalian genomes has been 
observed in microarray studies*, we found little evidence of such 
‘dark matter’*® (that is, pervasive transcription) in D. melanogaster. 
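
As a rough illustration of how a coverage percentage of this kind can be computed, the sketch below counts the bases reached by at least two reads, the cutoff given in the Figure 3 legend. The function names, the 0-based half-open read intervals and the toy data are assumptions for illustration; this is not the authors' pipeline.

```python
from typing import List, Tuple


def depth_from_reads(genome_length: int, reads: List[Tuple[int, int]]) -> List[int]:
    """Per-base read depth from aligned reads given as 0-based, half-open (start, end)."""
    depth = [0] * genome_length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    return depth


def percent_covered(depth: List[int], min_reads: int = 2) -> float:
    """Percentage of positions whose depth reaches the read cutoff."""
    if not depth:
        raise ValueError("depth vector is empty")
    covered = sum(1 for d in depth if d >= min_reads)
    return 100.0 * covered / len(depth)


# Toy 10-bp "genome" with three overlapping reads (hypothetical data).
toy_depth = depth_from_reads(10, [(0, 4), (2, 6), (3, 8)])
print(toy_depth, percent_covered(toy_depth, min_reads=2))
```

A per-base list is adequate for a toy example; a genome-scale calculation would normally use interval arithmetic or a dedicated coverage tool instead.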


Discovery and dynamics of alternative splicing 

To characterize constitutive and alternative splicing, we identified 
71,316 splice junctions, of which 22,965 were new discoveries. Of the 
new splice junctions, 26% were supported by multiple experimental
data types and 74% by only one data type (Supplementary Fig. 9a),
primarily short poly(A)+ RNA-Seq. Of the 20,751 new junctions from
the short poly(A)+ RNA-Seq data, 7,833 were incorporated into new
transcript models or transcribed regions (NTRs). The remaining new 
junctions have yet to be incorporated into transcript models. 

We also identified a total of 102,026 exons (Supplementary Table 
26). Of the 52,914 representing new and revised exons, 65% were 
validated by capture and sequencing of cDNAs and 2,586 were sup- 
ported by RNA-Seq data from D. mojavensis and D. pseudoobscura. 
Of the new exons, 3,392 were identified from the new splice junctions 
but have yet to be incorporated into transcript models. 

To examine splicing dynamics throughout development, we cate- 
gorized all splicing events into the common types of alternative splicing 
events (Table 1). We identified a total of 23,859 splicing events, of 
which 18,369 were new or recategorized, a threefold increase from
annotated splicing events. An additional 2,988 retained/unprocessed
introns were identified that were supported by only one experimental 
data type. In all, 7,473 genes contain at least one alternative splicing 
event, which is 60.7% of the 12,295 expressed multi-exon genes—also a 
threefold increase in the fraction of genes with alternatively spliced 
transcripts. Although smaller than the fraction of human genes with 
alternatively spliced transcripts (95%)°’**, a larger proportion of 
Drosophila genes encode alternative transcripts than was previously 
known. 

Of the new alternative exons, 8,226 were previously annotated as 
constitutive. As observed’, annotated cassette exons, and their flank- 
ing introns, are more highly conserved than annotated constitutive 
exons (Fig. 4a). The newly discovered cassette exons are more highly 
conserved than the new constitutive exons, although both classes are 
less conserved than the corresponding class of annotated exons. New 
cassette exons that were previously annotated as constitutive exons 
are the most highly conserved set of exons (Fig. 4a). Annotated and 
new cassette exons show a strong tendency to preserve reading frame 
(Supplementary Fig. 9b), indicating that these transcripts increase 
protein diversity. Both annotated and new cassette exons tend to be 
shorter than their constitutive counterparts, although both sets of new 
exons tend to be shorter than annotated exons. 

To assess the extent of splicing variation we calculated the ‘per cent
spliced in’ or Ψ (ref. 38) for each splicing event in each sample as well as
the switch score (ΔΨ) by determining the difference between the highest
and lowest Ψ values across development (ΔΨ = Ψ_max − Ψ_min). This
revealed a very smooth distribution of ΔΨ among all events, indicating
that the splicing of most exons is fairly constant whereas only a minority
change markedly (Supplementary Fig. 9c and Supplementary Table 27).
Only 831 splicing events have a ΔΨ value >90. Further statistical
Figure 4 | Developmentally regulated splicing events. a, Conservation of
internal constitutive and cassette exons >50 nucleotides that were annotated or 
new discoveries. (Annotated constitutive, n = 26,127; annotated cassette, 

n = 438; modENCODE cassette, n = 173; modENCODE constitutive, n = 306; 
FlyBase 5.12 constitutive to modENCODE cassette, n = 304.) b, Clusters of 


4 | NATURE | VOL 000 | 00 MONTH 2010 


regulated cassette exon events during development. The scale bar indicates 
Z-scores of Ψ. c, Regulated alternative splicing in CadN during embryogenesis.
The maximal numbers of reads in the poly(A)+ RNA-Seq data are indicated for
each track.
analyses (see Supplementary Methods) identified 13,951 (66%) alternative
splicing events that change significantly throughout development
(Supplementary Table 28). 
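
The per cent spliced in (Ψ) and switch score (ΔΨ) described above can be sketched as follows. This is a simplified read-count ratio on a 0-100 scale with hypothetical function names and counts; the Methods attribute the actual Ψ calculation to JuncBASE, whose details are not reproduced here.

```python
from typing import List, Optional


def psi(inclusion_reads: int, exclusion_reads: int) -> Optional[float]:
    """Per cent spliced in (0-100); None when the event has no informative reads."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return 100.0 * inclusion_reads / total


def switch_score(psi_values: List[Optional[float]]) -> Optional[float]:
    """Switch score: highest minus lowest psi across samples, ignoring missing values."""
    observed = [v for v in psi_values if v is not None]
    if not observed:
        return None
    return max(observed) - min(observed)


# Hypothetical (inclusion, exclusion) read counts for one cassette exon
# in four developmental samples; the last sample has no coverage.
counts = [(5, 95), (50, 50), (98, 2), (0, 0)]
psis = [psi(inc, exc) for inc, exc in counts]
print(psis, switch_score(psis))
```

Under this definition, an event like the toy example would fall among the small group with a switch score above 90.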

Hierarchical clustering of cassette exon events revealed the dynamic 
nature of splicing throughout development (Fig. 4b), as exemplified by 
Cadherin-N (CadN), a gene with three sets of mutually exclusive exons 
(Fig. 4c). In each set, one exon is preferentially included in early embryos, 
the other in late embryos, with a smooth transition between the two. Our 
analysis also identified groups of exons that have coordinated splicing 
patterns (Fig. 4b). A set of 55 genes contain exons that are preferentially 
included in early embryos, late larvae, early pupae and females but 
skipped in all other stages. Gene Ontology (GO) analysis of these genes 
indicates that many encode proteins involved in epithelial cell-to-cell 
junctions. GO analysis of genes that contain exons preferentially 
included during late pupal and adult stages indicates that many encode 
proteins that are part of neuronal synapses. 


Sex-biased alternative splicing 


Sex determination in Drosophila is mediated by a cascade of regulated 
alternative splicing events involving Sex lethal (Sxl), transformer (tra), 
male-specific lethal 2 (msl-2), doublesex (dsx) and fruitless (fru) that 
specify nearly all physical and behavioural dimorphisms between 
males and females as well as X chromosome dosage compensation”. 
Our RNA-Seq data confirm sex-biased (ΔΨ = |Ψ_male − Ψ_female|)
splicing of Sxl (ΔΨ = 89.6), tra (ΔΨ = 39.2), dsx (ΔΨ = 59.7) and
fru (ΔΨ = 100).

In addition to the canonical sex-determination cascade, we identified
119 strongly sex-biased splicing events (ΔΨ > 70) (Supplementary
Fig. 9d). One striking example is Reps, which was annotated as
containing six constitutive exons. RNA-Seq data indicate that exon
five is a sex-biased alternative cassette exon (ΔΨ = 73.39) (Supplementary
Fig. 10). This highly conserved exon is included in males and
skipped in females. The intron upstream of this cassette exon contains
conserved SXL binding sites, indicating that the exon is regulated by SXL and
that Reps is a candidate sex differentiation gene.


Discovery of RNA editing sites 

Previous studies identified 127 sites in 55 Drosophila genes that 
undergo A-to-I RNA editing*'. This post-transcriptional modifica- 
tion is catalysed by dADAR, which is expressed at increasing levels 
throughout development and is thought to target products involved in 
nervous system function. We analysed the poly(A)+ RNA-Seq data to
identify exonic nucleotide positions consistent with A-to-I editing 
and defined 972 edited positions within transcripts of 597 genes, 
including previously described edited sites in the transcripts of 36 
genes (Supplementary Table 29). These genes include those required
for rapid neurotransmission and other widely ranging functions. For 
most sites, the frequency of editing increases throughout development 
and does not correlate with overall expression levels (Fig. 5a). Editing 
typically begins in late pupal stages, although we find transcripts that 
seem to be edited in late embryogenesis. Consistent with earlier studies”, 
exons containing editing sites are more highly conserved than unedited 
exons. The majority of the edited positions (630) alter amino acid coding;
the others are either silent (201) or within untranslated regions
(141). For example, the transcripts of quiver (qvr) are edited at six posi- 
tions, four that result in amino acid changes (Fig. 5b). qvr encodes a 
potassium channel subunit that modulates the function of the voltage- 
gated Shaker (SH) potassium channel. Sh transcripts are also edited 
at multiple positions*. The combinatorial editing of both proteins 
probably has an important role in modulating action potentials in 
the arthropod nervous system and may have implications for the regulation
of sleep**. Expressed sequence tags, long poly(A)+ RNA-Seq and
cDNAs cross-validate nearly one-quarter (214) of the newly discovered 
sites. 
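
Because inosine is read as guanosine during sequencing, A-to-I editing appears as A-to-G mismatches when reads are compared with the reference. The sketch below shows that comparison in its simplest form. The function name, the input layout and both thresholds are hypothetical, bases are assumed to be reported on the transcript strand, and the filters needed to separate editing from polymorphisms or sequencing errors are omitted.

```python
from typing import Dict, Iterable, List, Tuple

# One pileup record: (position, reference base, {observed base: read count})
PileupRecord = Tuple[int, str, Dict[str, int]]


def candidate_editing_sites(
    pileup: Iterable[PileupRecord],
    min_depth: int = 10,          # hypothetical minimum read depth
    min_fraction: float = 0.10,   # hypothetical minimum fraction of G reads
) -> List[Tuple[int, float]]:
    """Return (position, G fraction) for reference-A positions with A-to-G mismatches."""
    sites = []
    for position, ref_base, counts in pileup:
        if ref_base != "A":
            continue  # only adenosines can be A-to-I edited
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to estimate an editing frequency
        g_fraction = counts.get("G", 0) / depth
        if g_fraction >= min_fraction:
            sites.append((position, g_fraction))
    return sites


toy_pileup = [
    (101, "A", {"A": 30, "G": 10}),  # candidate: 25% of reads show G
    (102, "A", {"A": 40}),           # unedited adenosine
    (103, "C", {"C": 20, "T": 20}),  # mismatch, but not at an adenosine
    (104, "A", {"A": 2, "G": 3}),    # too shallow to call
]
print(candidate_editing_sites(toy_pileup))
```

The G fraction at each site doubles as an estimate of editing frequency, the quantity reported to increase through development for most sites.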

Computational analysis identified three potential editing-associated 
sequence motifs (Fig. 5a). We observe 381 sites with one or more motifs 
in close proximity to the edited nucleotide (Supplementary Table 30). 
Motif C, although less common than motifs A and B, is more strongly 
associated with the editing site. Most (93%) instances of motif C occur 
on the sense strand of the transcript and the A at the 3’ end of the motif 
is the edited nucleotide. This motif is over-represented in editing 
events that occur early in development. 


Discussion 


Our interrogation of the transcriptome of D. melanogaster throughout 
development has considerably expanded the number of building 
blocks used to make a fly. Specifically, we identified nearly 2,000 
NTRs, increased the number of alternative splicing events by threefold 
and the number of RNA editing sites by an order of magnitude. The 
resulting view of the transcriptome at single-base resolution markedly 
improves our understanding of expression dynamics throughout the 
Drosophila life cycle and has substantial biological implications. 

The D. melanogaster, C. elegans and human genomes are organized 
quite differently. Specifically, 20%, 45% and 2.5% of the D. melanogaster, 
C. elegans and human genomes, respectively, encode exons or mature 
transcripts. Primary transcripts comprise a larger fraction of each 
genome—60%, 82% and 37%. This highlights the fact that primary 
transcripts and introns are much shorter in D. melanogaster and C. 
elegans than in human and that the D. melanogaster and C. elegans 
genomes are more compact than the human genome. 
The existence of unannotated genes was indicated by microarray
studies**° and conservation among Drosophilid genomes”. However, 
the NTRs that we identified were not identified by comparative sequence 
analysis*° as they are less conserved than most previously known genes. 
This emphasizes the importance of using both comparative analyses and 
transcriptome profiling for genome annotation. 

Despite the depth of our sequencing, the annotation of the D. mela- 
nogaster transcriptome is not finished. We failed to detect expression of 
1,488 annotated genes including members of gene families to which 
short reads cannot be uniquely mapped and genes expressed at low
levels or in spatially and temporally restricted patterns. Moreover, 
although we substantially increased the fraction of genes that encode 
alternatively spliced or edited transcripts, we again failed to detect several 
annotated RNA processing events. Study of more temporally and spa- 
tially restricted samples will allow deeper exploration of the Drosophila 
transcriptome, and almost certainly result in the discovery of yet addi- 
tional features. Furthermore, functional studies of the new and previ- 
ously unstudied elements will provide valuable insight into metazoan 
development. 


METHODS SUMMARY 


Animal staging, collection and RNA extraction. Isogenic (y1; cn bw1 sp1)
embryos were collected at 2-h intervals for 24 h. Collection of later staged animals 
started with synchronized embryos and included resynchronizing with appro- 
priate age indicators. Six larval, six pupal and three adult sexed stages, 1, 5 and 30 
days, were collected. RNA was isolated using TRIzol (Invitrogen), DNased and 
purified on an RNeasy column (Qiagen). poly(A)+ RNA was prepared from an
aliquot of each total RNA sample using an Oligotex kit (Qiagen). 

Tiling arrays. RNAs from three biological replicates of each sample were inde- 
pendently hybridized on 38-bp arrays (Affymetrix GeneChip Drosophila Tiling 
2.0R array) as described”. 

RNA-Seq. Libraries were generated and sequenced on an Illumina Genome 
Analyser IIx using single or paired-end chemistry and 76-bp cycles. SOLiD sequen- 
cing used total RNA treated with the RiboMinus Eukaryote Kit (Invitrogen). 
Samples were fragmented, adaptors ligated (Ambion) and sequenced for 50 bases 
using SOLiD V3 chemistry. 454 sequencing used poly(A)+ RNA from Oregon R
adult males and females and mixed-staged y1; cn bw1 sp1 embryos. Sequences are
available from the Short Read Archive and the modENCODE website
(http://www.modencode.org/).

Targeted RT-PCR and cDNA isolation and sequencing. Standard procedures 
were used for RT-PCR and targeted cDNA isolation and sequencing. 

Analysis. Cufflinks** was used to identify new transcript models and to calculate 
expression levels for annotated and predicted transcript models. MFold** was 
used to predict secondary structures from the new snoRNA-like RNAs. 
JuncBASE” identified alternative splicing events and calculated per cent spliced
in (Ψ; ref. 38). Editing sites were identified by comparing aligned reads to the reference
genome. 
Comprehensive analysis of the chromatin
landscape in Drosophila melanogaster 


Peter V. Kharchenko, Artyom A. Alekseyenko, Yuri B. Schwartz, Aki Minoda, Nicole C. Riddle, Jason Ernst,
Peter J. Sabo, Erica Larschan, Andrey A. Gorchakov, Tingting Gu, Daniela Linder-Basso, Annette Plachetka,
Gregory Shanower, Michael Y. Tolstorukov, Lovelace J. Luquette, Ruibin Xi, Youngsook L. Jung, Richard W. Park,
Eric P. Bishop, Theresa P. Canfield, Richard Sandstrom, Robert E. Thurman, David M. MacAlpine,
John A. Stamatoyannopoulos, Manolis Kellis, Sarah C. R. Elgin, Mitzi I. Kuroda, Vincenzo Pirrotta, Gary H. Karpen
& Peter J. Park


Chromatin is composed of DNA and a variety of modified histones and non-histone proteins, which have an impact on cell 
differentiation, gene regulation and other key cellular processes. Here we present a genome-wide chromatin landscape 
for Drosophila melanogaster based on eighteen histone modifications, summarized by nine prevalent combinatorial 
patterns. Integrative analysis with other data (non-histone chromatin proteins, DNase I hypersensitivity, GRO-Seq 
reads produced by engaged polymerase, short/long RNA products) reveals discrete characteristics of chromosomes, 
genes, regulatory elements and other functional domains. We find that active genes display distinct chromatin 
signatures that are correlated with disparate gene lengths, exon patterns, regulatory functions and genomic contexts. 
We also demonstrate a diversity of signatures among Polycomb targets that include a subset with paused polymerase. This 
systematic profiling and integrative analysis of chromatin signatures provides insights into how genomic elements are 
regulated, and will serve as a resource for future experimental investigations of genome structure and function. 


The model organism Encyclopedia of DNA Elements (modENCODE) 
project is generating a comprehensive map of chromatin components, 
transcription factors, transcripts, small RNAs and origins of replication 
in Drosophila melanogaster and Caenorhabditis elegans'*. Drosophila 
has been used as a model system for over a century to study chro- 
mosome structure and function, gene regulation, development and 
evolution. The availability of high-quality euchromatic and heterochro- 
matic sequence assemblies**, extensive annotation of functional ele- 
ments®, and a vast repertoire of experimental manipulations enhance 
the value of epigenomic studies in Drosophila. 

Genome-wide profiling of chromatin components provides a rich 
annotation of the potential functions of the underlying DNA sequences. 
Previous work has identified patterns of post-translational histone modi- 
fications and non-histone proteins associated with specific elements (for 
example, transcription start sites, enhancers), as well as delineating the 
transcriptional status of genes and large domains’*. Here we present a 
comprehensive picture of the chromatin landscape in a model eukaryotic 
genome. We define combinatorial chromatin ‘states’ at different levels of 
organization, from individual regulatory units to the chromosome level, 
and relate individual states to genome functions. 


Combinatorial chromatin states 


We performed chromatin immunoprecipitation (ChIP)-array analysis
for numerous histone modifications and chromosomal proteins
(Supplementary Table 1), using antibodies tested for specificity and
cross-reactivity’ (Supplementary Fig. 1). Here we describe analyses of
cell lines S2-DRSC (S2) and ML-DmBG3-c2 (BG3), derived from late 
male embryonic tissues (stages 16-17) and the central nervous system 
of male third instar larvae, respectively (see http://www.modencode.org 
for data from other cell lines and animal stages). Analysis reveals groups 
of correlated features, including those associated with heterochromatic 
regions’’, Polycomb-mediated repression"’, and active transcription” 
(Supplementary Fig. 2), similar to those observed in other organisms'?*. 
This indicates that specific histone modifications work together to 
achieve distinct chromatin ‘states’. 

We used a machine-learning approach to identify the prevalent 
combinatorial patterns of 18 histone modifications, capturing the overall
complexity of chromatin profiles observed in S2 and BG3 cells with 9
combinatorial states (Fig. 1a, Methods). The model associates each
genomic location with a particular state, generating a chromatin-centric
annotation of the genome (Fig. 1b). We examined each state for enrichment
in non-histone proteins (Fig. 1a and Supplementary Fig. 3) and
gene elements, as well as distribution across the karyotype (Fig. 1b and 
Supplementary Fig. 4) and finer-scale levels (Fig. 1c-e). 

Most distinct chromatin states are associated with transcriptionally 
active genes. Active promoter and transcription start site (TSS)-proximal 
regions are identified by state 1 (Fig. 1; red), marked by prominent 
enrichment in H3K4me3/me2 (tri/dimethylation of residue K4 of 
Figure 1 | Chromatin annotation of the Drosophila melanogaster genome.
a, A 9-state model of prevalent chromatin states found in S2 and BG3 cells. Each
chromatin state (row) is defined by a combinatorial pattern of enrichment (red) 
or depletion (blue) for specific chromatin marks (first panel, columns; active 
marks in green, repressive in blue). For instance, state 1 is distinguished by 
enrichment in H3K4me2/me3 and H3K9ac, typical of transcription start sites 
(TSS) in expressed genes. The enrichments/depletions are shown relative to 
chromatin input (S2 data shown, see Supplementary Fig. 3 for BG3 data and 
histone density normalization). The second panel shows average enrichment of 
chromosomal proteins. The third panel shows fold over/under-representation 
of genic and TSS-proximal (±1 kb) regions relative to the entire tiled genome.
The enrichment of intronic regions is relative to genic regions associated with 
each state. b, A genome-wide karyotype view of the domains defined by the 
9-state model in S2 cells. Centromeres are shown as open circles, and dashed 


histone H3) and H3K9ac (acetylation of K9 of histone H3). The tran- 
scriptional elongation signature associated with H3K36me3 enrichment 
is captured by state 2 (purple), found preferentially over exonic regions of 
transcribed genes. State 3 (brown), typically found within intronic 
regions, is distinguished by high enrichment in H3K27ac, H3K4me1
and H3K18ac. A related chromatin signature is captured by state 4
(coral), distinguished by enrichment of H3K36me1, but notably lacking
H3K27ac. The number of genes associated with each chromatin state and 
the distribution of states within genes are shown in Supplementary Fig. 5. 

Several aspects of large-scale organization are revealed by the karyotype 
view (Fig. 1b). Chromosome X is markedly enriched for state 5 (green), 
distinguished by high levels of H4K16ac in combination with H3K36me3 
and other marks of ‘elongation’ state 2 (a combinatorial pattern associated 
with dosage compensation in male cells'*). Pericentromeric heterochro- 
matin domains and chromosome 4 are characterized by high levels of 
H3K9me2/me3 (state 7, dark blue)'®. Finally, the model distinguishes 
another set of heterochromatin-like regions containing moderate levels 
of H3K9me2/me3 (state 8, light blue; Fig. 1e). Surprisingly, this state
occupies extensive domains in autosomal euchromatic arms in BG3 cells, 
and in chromosome X in both cell lines'®. 

Further aspects of chromatin organization can be visualized by folding 
the chromosome using a Hilbert curve (Fig. 2a)'’, which maintains the 




lines span gaps in the genome assembly. Several prominent chromatin 
organization features are illustrated (colour code in a), including the extent of 
pericentromeric heterochromatin (state 7) and the H4K16ac-driven signature 
of the dosage-compensated male X chromosome (state 5). (BG3 in 
Supplementary Fig. 4.) c-e, Examples of chromatin annotation at specific loci. 
c, Two distinct chromatin signatures of transcriptionally active genes: one (left) 
is associated with enrichment in marks of states 3 and 4, whereas the other 
(right) is limited to states 1 and 2, recapitulating well established TSS and 
elongation signatures (note that small patches of state 7 in CG13185 illustrate 
H3K9me2 found at some expressed genes in S2 cells'®). d, A locus containing 
two Polycomb-associated domains, silent (left) and balanced (right). e, A large
state 8 domain located within euchromatic sequence in BG3 cells, enriched for 
chromatin marks typically associated with heterochromatic regions, but at 
lower levels than in pericentromeric heterochromatin (state 7). 


spatial proximity of nearby elements. Thus, local patches of correspond- 
ing colours reveal the sizes and relative positions of domains associated 
with particular chromatin states (Fig. 2b and Supplementary Figs 6-9). 
For instance, specks of TSS-proximal regions (state 1) are typically con- 
tained within larger blocks of transcriptional elongation marks (state 2), 
which are in turn encompassed by extensive patches of H3K36me1-enriched
domains (state 4) and variable-sized blocks of state 3. The
clusters of open chromatin formed by these gene-centric patterns are 
separated by extensive silent domains (state 9) and regions of Polycomb- 
mediated repression (state 6). Factors responsible for domain bound- 
aries were not identified in our analysis (Supplementary Fig. 10). 
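
The folding described above can be sketched with the standard index-to-coordinate conversion for a Hilbert curve. The function names and the choice of one grid cell per genomic bin are assumptions for illustration; the source does not specify how the published figures were rendered.

```python
from typing import List, Optional, Sequence, Tuple


def hilbert_d2xy(order: int, d: int) -> Tuple[int, int]:
    """Map position d along a Hilbert curve to (x, y) on a 2**order by 2**order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # flip the sub-square before swapping axes
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # rotate the sub-square
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def fold_states(states: Sequence[int], order: int) -> List[List[Optional[int]]]:
    """Place consecutive genomic bins (one chromatin state each) along the curve."""
    n = 1 << order
    grid: List[List[Optional[int]]] = [[None] * n for _ in range(n)]
    for d, state in enumerate(states[: n * n]):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = state
    return grid


# Sixteen hypothetical bins: a short state-1 segment inside a state-2 block,
# flanked by state 9, folded onto a 4 x 4 grid.
toy_states = [9, 9, 9, 2, 2, 1, 1, 2, 2, 2, 9, 9, 9, 9, 9, 9]
for row in fold_states(toy_states, order=2):
    print(row)
```

Consecutive bins always land in neighbouring cells, which is why a contiguous domain appears as a compact patch; the converse does not hold, so some neighbouring cells correspond to bins that are far apart on the chromosome.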

We also developed a multi-scale method to characterize chromatin 
organization at the spatial scale appropriate for the genome properties 
being investigated. For example, we observe that chromatin patterns most 
accurately reflect the replication timing of the S2 genome at scales of 
~170 kb (Supplementary Information, section 1). This is consistent with 
size estimates of chromatin domains influencing replication timing"’, and 
suggests that multiple replication origins are coordinately regulated by the 
local chromatin environment (each replicon is ~28-50 kb’). 

To examine combinatorial patterns not distinguished by the simplified 
9-state model, we also generated a 30-state combinatorial model that uses 
presence/absence probabilities of individual marks”? (Supplementary 


©2010 Macmillan Publishers Limited. All rights reserved 


[Figure 2b panel labels: Chromosome 3L; cluster of small expressed genes; open chromatin domain; PcG domains; heterochromatin-like domain; pericentromeric heterochromatin; colour key for chromatin states.]


Figure 2 | Visualization of spatial scales and organization using compact 
folding. a, The chromosome is folded using a geometric pattern (Hilbert space- 
filling curve) that maintains spatial proximity of nearby regions. An illustration 
of the first four folding steps is shown. Note that although this compact curve is 
optimal for preserving proximity relationships, some distal sites appear adjacent 
along the fold axis (green dots). b, Chromosome 3L in S2 cells. A domain of a 
given chromatin state appears as a patch of uniform colour of corresponding 
size. Thin black lines are used to separate regions that are distant on the 
chromosome. The folded view illustrates chromatin organization features that 
are not easily discerned from a linear view: active TSSs (state 1) appear as small 
specks surrounded by elongation state 2, commonly next to larger regions 
marked by H3K36me1-driven state 4, which also contains patches of
intron-associated state 3. These open chromatin regions are separated by extensive
domains of state 9. See Supplementary Figs 6 and 7 for other chromosomes and 
BG3 data. The folded views can be browsed alongside the linear annotations and 
other relevant data online: http://compbio.med.harvard.edu/flychromatin. 
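The folding described in the caption can be illustrated with a minimal sketch. It assumes the chromosome has already been divided into equal-sized bins, each labelled with a chromatin state, and uses the standard index-to-coordinate conversion for a Hilbert curve. The function names `hilbert_d2xy` and `fold_states` are hypothetical and are not taken from the authors' software.

```python
def hilbert_d2xy(order, d):
    """Map a linear index d (0 <= d < 4**order) to (x, y) on a Hilbert curve."""
    x = y = 0
    t = d
    s = 1
    side = 1 << order
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the quadrant so that consecutive pieces join up.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def fold_states(states, order):
    """Place a linear list of per-bin states on a (2**order x 2**order) grid."""
    side = 1 << order
    grid = [[None] * side for _ in range(side)]
    for d, state in enumerate(states[: side * side]):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = state
    return grid
```

Neighbouring bins on the chromosome always land on neighbouring grid cells, which is why a contiguous domain appears as a compact patch. The converse does not hold: some adjacent cells are far apart on the chromosome, as the caption notes.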


Fig. 11). The increased number of states can identify finer variations that 
are biologically significant, for example, a signature corresponding to 
transcriptional elongation in heterochromatic regions"®. 


Chromatin state variation among genes 

Active genes generally display enrichments or depletions of individual 
marks at specific gene segments (Fig. 3a). When classified according 
to their chromatin signatures (Supplementary Fig. 12), active genes 
fall into subclasses correlated with expression magnitude (Sup- 
plementary Information, section 2), gene structure and genomic 
Figure 3 | Chromatin patterns associated with transcriptionally active
genes. a, Location and extent of chromatin features relative to boundaries of
expressed genes (≥1 kb) in BG3 cells. The colour intensity indicates the relative
frequency of enrichment/depletion (red/blue) of a given mark within the gene
(normalized independently for each mark). b, Regions enriched for ‘active’
chromatin marks in long transcribed genes. The plot shows the extent of regions
enriched for various active marks at transcriptionally active genes (≥4 kb) on
context (for example, heterochromatic genes combine H3K9me2/me3
with some active marks)'®. Of particular interest is one class of
long expressed genes, many with regulatory functions, which are
enriched for H3K36me1 (cluster 2, Supplementary Fig. 12; 131 genes
in S2, 202 in BG3; Supplementary Table 2).

To examine further the patterns associated with long genes, we 
clustered expressed autosomal genes ≥4 kb based on blocks of enrichment
for each chromatin mark (Fig. 3b; 1,055 genes). We observe that
genes with large 5’-end introns (green subtree, Fig. 3b; 552 genes) 
show extensive H3K27ac and H3K18ac enrichment, broader H3K9ac 
domains, and blocks of H3K36me1 enrichment (chromatin state 3,
Fig. 3b, last column). These genes are enriched for developmental and 
regulatory functions (Supplementary Table 3), and are positioned 
within domains of Nipped-B”' (Fig. 3b), a cohesin-complex loading 
protein previously associated with transcriptionally active regions*’”’. 
In contrast, genes with more uniformly distributed coding regions 
(red subtree, Fig. 3b) lack most state 3 marks, and H3K9ac enrichment 
is restricted to the 2 kb downstream of the TSS. These differences are 
not explained by variation in histone density (Supplementary Fig. 13). 
Overall, the presence or absence of state 3 is the most common dif- 
ference in the chromatin composition of expressed genes that are 1 kb 
and longer (Supplementary Fig. 14), and the presence of state 3 con- 
sistently correlates with a reduced fraction of coding sequence in the 
gene body, mainly associated with the presence of a long first intron. 

State 3 domains are highly enriched for specific chromatin remodelling 
factors (SPT16 (also known as DRE4) and dMi-2; Supplementary Figs 15
and 16), whereas state 1 regions around active TSSs are preferentially
bound by NURF301 (also called E(bx)) and MRG15. ISWI is enriched in
both states 1 and 3 (Supplementary Figs 16 and 17). State 3 domains also
exhibit the highest levels of nucleosome turnover”, and show higher 
enrichment of the transcription-associated H3.3 histone variant™ than 
either the TSS- or elongation-associated states 1 and 2 (Supplementary 
Figs 15 and 16). Consistent with earlier analyses of cohesin-bound 
regions”, state 3 sequences tend to replicate early in G1 phase, and show 
abundance of early replicating origins (Supplementary Fig. 18). A regu- 
latory role for state 3 domains is suggested by enrichment for a known 
enhancer binding protein (dCBP/p300”*) in adult flies, and for enhancers 
validated in transgene constructs” (Supplementary Fig. 19). 


Modes of regulation in Polycomb domains 

In Drosophila, loci repressed by Polycomb group (PcG) proteins are 
embedded in broad H3K27me3 domains that are regulated by 
Polycomb response elements (PREs) bound by E(Z), PSC and dRING 
(Fig. 1d)**”’. We find that regions of H3K4me1 enrichment surround all
BG3 autosomes. Each row represents a scaled gene. The first column illustrates
coding exons; the last column shows chromatin state annotation. The clustering 
of the genes according to the spatial patterns of chromatin marks separates 
genes with a high fraction of coding sequence (red subtree, bottom) from genes 
containing long introns (green subtrees, top), which are associated with 
chromatin state 3 (last column) and binding of specific chromosomal proteins, 
such as Nipped-B’' (also see Supplementary Fig. 13). 
PREs, 90% of which also display narrower peaks of H3K4me2 enrichment
(Supplementary Fig. 20). Although this pattern is reminiscent of transcrip- 
tionally active promoter regions, PREs lack H3K4me3, indicating that a 
different mechanism of H3K4 methylation is used, perhaps involving the 
Trithorax H3K4 histone methyltransferase (HMTase) found at all PREs”. 

To examine chromatin states associated with PcG targets, we analysed 
the chromatin and transcriptional signatures of TSSs in Polycomb-bound 
domains (Fig. 4a and Supplementary Fig. 21). In addition to fully repressed 
TSSs (cluster 1, Fig. 4a), we identify TSSs maintained in the ‘balanced’ 
state” (cluster 2, Fig. 4a), distinguished by coexistence of Polycomb with 
active marks (including the HMTase ASH1) and production of full-length 
messenger RNA transcripts (for example, Psc domain, Fig. 1d). 

TSSs in clusters 3 and 4 are distinguished by the presence of adja- 
cent PREs (Fig. 4a). Surprisingly, 53% of the PRE-proximal TSSs 
produce short RNA transcripts”? (cluster 3, Fig. 4a), indicating stalling 
of engaged RNA pol II°°. Using the global run-on sequencing (GRO- 
Seq) assay to accurately assess engaged RNA polymerases*', we 
observe that cluster 3 TSSs produce short transcripts in the sense 
orientation. The level of GRO” signal is similar to that found at fully 
transcribed genes (Supplementary Fig. 22); thus, some transcription 
initiates in cluster 3, but elongation fails. Interestingly, these genes are 
enriched for regulatory and developmental functions, even more than 
other genes within Polycomb domains (see Supplementary Tables 4 
and 5). Genes without TSS-proximal PREs generally lack short tran- 
script signatures (for example, cluster 1 in Fig. 4a; see Supplementary 
Fig. 21 for exceptions). Importantly, engaged polymerases and tran- 
scripts are not a general feature of PREs; TSS-distal PREs typically lack 
short RNA and GRO-Seq signals (Fig. 4b and Supplementary Fig. 22) 
despite being similarly enriched in H3K4me1/me2. The striking link
between TSS-proximal PREs and the production of short RNAs sug- 
gests a potential mechanism for control of these developmental regu- 
latory genes, whereby the same features that recruit H3K4 methyl marks 
to PREs may also facilitate RNA pol II recruitment to nearby TSSs. 


DHS plasticity and chromatin states 


We used a DNase I hypersensitivity assay**”? to examine the distributions 
of putative regulatory regions and their relationships with chromatin 
states. DNase I hypersensitivity mapping broadly identifies sites with 
low nucleosome density and regions bound by non-histone proteins**”*. 
Short-read sequencing identified 8,616 high-magnitude DNase I hyper- 
sensitive sites (DHSs) in S2 cells and 6,354 in BG3 cells (and a com- 
parable number of low-magnitude DHSs; Supplementary Fig. 23 and 
Methods). Approximately half of the high-magnitude DHSs are found at 
transcriptionally active TSSs (Supplementary Fig. 24). Thus, the chro- 
matin context of the TSS-proximal DHSs is dominated by the features 




expected for an active TSS, including RNA pol II, H3K4me3 and other 
state 1 marks (clusters 1, 2; Fig. 5a and Supplementary Fig. 25). 

Of the 36% TSS-distal DHSs, most (60%) are positioned within 
annotated expressed genes (Supplementary Fig. 24). These gene-body 
DHSs are distinguished from TSS-proximal DHSs by low H3K4me3, 
higher levels of H3K4me1, H3K27ac, and other marks linked to chro- 
matin state 3 (clusters 3, 4; Fig. 5a and Supplementary Fig. 26). An 
additional 20% of the TSS-distal DHSs are outside of annotated genes, 
but show signatures associated with active transcription starts or 
elongation, suggesting new alternative promoters or unannotated 
genes (Supplementary Figs 27 and 28). The remaining 20% of TSS- 
distal DHSs that appear to be intergenic (6% of all DHSs) are typically 
enriched for H3K4me1, but lack other active marks (cluster 5, Fig. 5a).

Most DHS positions fall into the TSS-proximal state 1 or the intron- 
biased state 3 (Fig. 5b). State 3 lacks H3K4me3 and is enriched for 
H3K4me1, H3K27ac and H3K18ac, similar to mammalian enhancer
elements*’. Many state 3 DHS positions are occupied by known regulatory 
proteins: GAGA factor binds to 49% of these DHSs in S2 cells, and 
developmental transcription factors bind to 44% of these DHSs in 
embryos”. Notably, we find that TSS-distal DHSs in Drosophila exhibit 
low-level bi-directional transcripts (Fig. 5a, shortRNA panel; see also Sup- 
plementary Figs 29 and 30), analogous to the enhancer RNAs (eRNAs) 
characterized in mice*’. Analysis of GRO-Seq data (Fig. 5e) indicates that 
eRNA-like transcripts are common to both intra- and intergenic TSS- 
distal DHSs in Drosophila, a feature that is conserved with mammals. 

The association of DHSs with chromatin states 1 and 3 (Fig. 5c) 
persists even in chromosome 4 and pericentromeric heterochromatin, 
where such states are infrequent (Supplementary Fig. 31). This suggests 
that these chromatin states and associated remodelling factors (for 
example, ISWI, SPT16) provide the context necessary for non-histone 
chromosomal protein binding at DHSs, or are the consequence of such 
binding events. To investigate this interdependency, we analysed a 
high-confidence set of loci that exhibit DHSs in only one of the two 
examined cell lines (Supplementary Fig. 32). Surprisingly, although in 
general more DHSs are in state 1 regions, 91% of the cell-type-specific 
DHSs are found within state 3 domains (14-fold increase compared to 
state 1 DHSs; Supplementary Table 6 and Fig. 5d). Comparison with 
DHSs in an additional cell type (Kc167, Supplementary Fig. 33) con- 
firms that DHSs displaying plasticity between cell types are mostly 
found in state 3. When DHSs are absent, the altered loci maintain 
chromatin state 3 in 23% of the cases (Fig. 5d), indicating that the 
presence of state 3 is not always dependent on the DHS. More fre- 
quently, the altered loci transition to state 4 (43% of the cases), an open 
chromatin state that lacks many of the histone modifications and chro- 
matin remodellers characteristic of state 3. Although the less frequent 
Figure 4 | Signatures of TSSs within domains of Polycomb-mediated
repression. a, Distinct classes of TSSs in S2 cell Polycomb domains. Each row
represents a TSS. Clusters 1-5 illustrate distinct TSS states (see Supplementary
Fig. 21 for complete set of clusters). Cluster 1 shows fully repressed TSSs with
the expected pattern of PC and H3K27me3 enrichment; cluster 2 shows 21 TSSs
found within ASH1 domains, maintained in a balanced state. Clusters 3 and 4
distinguish TSSs located in the immediate proximity of Polycomb response
elements (PREs), showing the symmetrical H3K4me1/me2 enrichment typical
of all PREs. Many such TSSs (cluster 3, 42 TSSs) produce short,
non-polyadenylated transcripts along the sense strand (GRO*/shortRNA*
columns), indicating the presence of paused polymerase. b, PRE positions
distant from annotated TSSs. TSS-distal PREs exhibit enrichment for
H3K4me1/me2, but are not associated with GRO or shortRNA signatures.
(Colour scale: log2 enrichment.)
Figure 5 | Chromatin signatures of regulatory elements identified by DNase
I hypersensitivity. a, Representative classes of high-magnitude DNase I 
hypersensitive sites (DHSs) and chromatin signatures in S2 cells. TSS-proximal 
(within 2 kb) DHSs show chromatin signatures expected of expressed gene 
promoters: high H3K4me3 and RNA pol II signal extending in the direction of 
transcription (left to right; cluster 2 groups bi-directional promoters).
TSS-distal DHSs are associated with high H3K4me1 and low H3K4me3 levels. Most
TSS-distal DHSs found within the bodies of expressed genes (clusters 3, 4) are 
associated with chromatin state 3. A cluster of rare intergenic DHSs (cluster 5) 
is associated with localized peaks of H3K4me1/2 (complete sets of clusters in 
Supplementary Figs 25, 26 and 28). b, Distribution of DHS positions among 
chromatin states. The vast majority of DHSs are found within the TSS-proximal 
state 1 or enhancer-like state 3 regions. c, States 1 and 3 exhibit the highest 


transitions to the Polycomb state 6 (7%) or background state 9 (17%) 
typically coincide with gene silencing, most of the genes that maintain 
state 3 or transition to state 4 remain transcriptionally active (Supplemen- 
tary Fig. 34). These observations provide further support for an enhancer- 
like function for state 3 DHSs, and suggest a more subtle regulatory role 
than simple linkage to the presence or absence of gene expression. 


Chromatin annotation of genome functions 

The genomic chromatin state annotation and discovery of refined 
chromatin signatures for chromosomes, domains, and subsets of regu- 
latory genes demonstrate the utility of a systematic, genome-wide pro- 
filing of an organism that is already understood in considerable detail. 
Clearly, the definition and functional annotation of chromatin patterns 
will be enhanced by incorporation of data for different types of com- 
ponents. Five ‘colours’ of chromatin were recently identified in Kc167 
cells using chromosomal protein maps*’. Comparison with our 9-state 
model shows similarities as well as differences in the ability to distin- 
guish functional elements (Supplementary Fig. 35); thus, further integ- 
ration of such data in the same cell type may resolve additional 
functional features. Our results illustrate the utility of integrating mul- 
tiple data types (histone marks, non-histone proteins, chromatin 
accessibility, short RNAs and transcriptional activity) for comprehens- 
ive characterization of functional chromatin states. 

An important, repeated theme is that chromatin state analysis 
identifies unexpected distinctions between subsets of active genes. 
Besides the differences linked to genomic context (for example, male 
X chromosome, heterochromatin), the main source of variability is 
the presence of the acetylation-rich state 3 (Fig. 6). Several lines of 
evidence suggest that the intronic positions marked by state 3 are 
important for gene regulation. State 3 regions show specific associa- 
tions with known chromatin remodellers (SPT16, dMi-2 and ISWI) 
and gene regulatory proteins (for example, GAF, dCBP/p300), and the 
highest rates of nucleosome turnover and transcription-dependent 
deposition of the H3.3 variant. State 3 genes are also bound by cohesin 
density of DHSs. d, Cell-line-specific DHSs are positioned predominantly 
within the enhancer-like state 3. The transition matrix shows the chromatin 
state of loci containing DHSs in one cell line (x axis), and the state of the same 
locus in the other cell line where the DHS is absent (y axis). Most of the DHSs 
that differ between cell lines originate from state 3. When DHSs are absent, the 
loci typically transition to an open chromatin state 4 (43%), or maintain state 3 
(23%). In both scenarios, most of the associated genes remain transcriptionally 
active (see Supplementary Fig. 34). e, Low levels of engaged RNA polymerase 
are associated with TSS-distal DHSs. The top plot shows the local increase in 
the antisense GRO-Seq signal for DHSs located within transcribed genes; 
dashed lines show median levels. Intergenic DHS positions (bottom plot) also 
show bi-directional GRO-Seq signal of comparable magnitude. See 
Supplementary Figs 27, 29 and 30. 


complex proteins, thought to associate with decondensed chromatin”' 
to promote looping interactions with promoter regions”. 

A regulatory role for state 3 chromatin is further suggested by the 
high density of DHSs, comparable to that of active TSS state 1, and the 
fact that state 3 accounts for most of the DHS plasticity among cell 
types. The combinations of histone marks found in state 3 are similar to 
signatures of mammalian enhancers”, which also show high variability 
between cell types*’. Furthermore, state 3 DHSs exhibit low levels of 
short, non-coding bidirectional transcripts reminiscent of eRNAs iden- 
tified in mice*’. Together, these findings suggest that state 3 regions 
contain enhancers or other regulatory elements, and that a combination 
of modifications can be used to identify new elements in the genome. 

Genes within repressive Polycomb domains also display several distinct 
combinatorial chromatin patterns (Fig. 4a), which probably represent a 
range of functional states: repressed, paused, or expressed genes in either 
balanced” or fully activated states. Alternatively, distinct signatures might 
mark subsets of regulatory genes that require either long-term repression 
or the ability to reverse functional states, depending on environmental or 
developmental cues. The PRE-proximal paused TSSs have some similarity 
to the ‘bivalent’ genes in mammalian cells, which also display transcrip- 
tional pausing of key regulatory and developmental genes*"””. However, 
the mammalian ‘bivalent state’ is characterized by the simultaneous 
Figure 6 | Spatial arrangements of chromatin states associated with active
transcription. Unlike short or exon-rich expressed genes, expressed genes with 
long intronic regions commonly contain one or more regions of enhancer-like 
state 3, associated with specific chromosomal proteins, high nucleosome 
turnover and DHSs displaying cell-type plasticity. 
presence of PcG proteins, H3K27me3 and H3K4me3, which in
Drosophila is found only in the fully elongating ‘balanced’ state”. 
Comprehensive analysis of chromatin signatures has enormous 
potential for annotating functional elements in both well studied 
and new genomes. Going forward, our systematic characterization 
of the epigenomic and transcriptional properties of Drosophila cells 
should spur in-depth experimental analyses of the relationship 
between chromatin states and genome functions, ranging from whole 
chromosomes down to individual regulatory elements and circuits. 


METHODS SUMMARY 


Histone modification and chromosomal protein antibodies were characterized 
for cross-reactivity. ChIP-chip was performed in duplicate, using Affymetrix 
Drosophila Tiling 2.0R Arrays. Digital DNase I-Seq assays were performed as 
described previously“, and Global Run-On library (GRO-Seq) data was generated 
as described previously*'. Short RNA data was generated by ref. 30, and RNA-Seq 
data was generated by ref. 45. See ref. 46 for other modENCODE RNA-Seq data. 
The chromatin state models were generated as hidden Markov models (HMMs) of 
different histone marks. DHSs were identified as read density peaks significantly 
enriched relative to the genomic DNA control. Clustering of chromatin signatures 
was determined using the PAM algorithm. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

Growth conditions. ML-DmBG3-c2 cells were obtained from DGRC
(https://dgrc.cgb.indiana.edu/), and S2-DRSC cells were from the DRSC
(http://www.flyrnai.org/). All cell lines were grown to a density of
~5 × 10⁶ cells ml⁻¹ in Schneider's media (Gibco) supplemented with 10% FCS
(HyClone). 10 µg ml⁻¹ insulin was added to the ML-DmBG3-c2 media.

Antibodies. Antibodies are listed in Supplementary Table 1. Commercial 
antibodies against modified histones were tested by western blot for the lack of 
cross-reactivity with the corresponding recombinant histone produced in 
Escherichia coli and non-histone proteins from embryonic nuclear extracts. 
Antibody specificity was further assayed by western dot/slot blot against a panel 
of synthetic modified histone peptides. Only antibodies that showed <50% of 
total signal associated with non-histone proteins, and more than fivefold higher 
affinity for the corresponding histone peptide, were used in ChIP experiments. 

The specificity of antibodies against chromosomal proteins was tested by western
blots with nuclear extracts prepared from mutant flies or S2 cells subjected to RNAi
knockdown (ref. 47). An antibody was considered specific if it recognized a major band of
expected mobility that was absent in the sample prepared from mutant flies, or
diminished twofold or more after RNAi depletion. When possible, distributions of a
chromosomal protein were mapped with two antibodies generated against different 
epitopes (see Supplementary Fig. 17). Data from chromatin proteins for which only 
one antibody was available were validated by comparison with published genomic 
distributions for a different component of the same complex, or to published 
genomic distributions generated with a different antibody. 
ChIP and microarray hybridization. Crosslinked chromatin from cultured cells 
was prepared as described”* with the following modifications. Before ultrasound 
shearing, cells were permeabilized with 1% SDS, and shearing was done in TE-PMSF
(0.1% SDS, 10 mM Tris-HCl pH 8.0, 1 mM EDTA pH 8.0, 1 mM PMSF)
using a Bioruptor (Diagenode) (2 × 10 min, 1 × 5 min; 30 s on, 30 s off; high
power setting). 

ChIP was performed as in ref. 28 and immunoprecipitated DNA was amplified 
using the whole genome amplification kit (WGA2, Sigma) according to the 
manufacturer’s instructions (chemical fragmentation step was omitted). The 
amplified material was labelled and hybridized to Drosophila Tiling Arrays 
v2.0 (Affymetrix) as in ref. 28. 

Processing of ChIP data. At least two independent biological replicates were 
assessed for each ChIP profile. The log2 intensity ratios (M values) were calculated
for each replicate. The profiles were smoothed using local regression (lowess) 
with 500 bp bandwidth, and the genome-wide mean was subtracted. The regions 
of significant enrichment were determined as clusters of at least 1 kb in length, 
with gaps no more than 100 bp where M value exceeds a statistically significant 
(0.1% false discovery rate (FDR)) enrichment threshold. The set of biological 
replicates was deemed consistent if the enriched regions from individual experi- 
ments had a 75% reciprocal overlap, or if at least 80% of the top 40% of the regions 
identified in each experiment were identified in the other replicate (before com- 
parison the replicates were size-equalized by increasing the significance threshold 
for a replicate with more enriched sequence). The data from individual replicates 
were then combined using local regression smoothing, and used for all of the 
presented analysis, unless noted otherwise. 
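The region-calling rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes sorted probe coordinates, a precomputed enrichment threshold (the 0.1% FDR estimation is not shown), and one reading of the rule in which consecutive above-threshold probes no more than 100 bp apart are chained, and chains spanning at least 1 kb are kept. The name `enriched_regions` is hypothetical.

```python
def enriched_regions(positions, m_values, threshold, max_gap=100, min_len=1000):
    """Return (start, end) spans of probes whose M value exceeds the threshold.

    positions: sorted probe coordinates in bp.
    m_values:  smoothed log2 ratios, one per probe.
    """
    regions = []
    start = prev = None
    for pos, m in zip(positions, m_values):
        if m <= threshold:
            continue
        if start is None:
            start = prev = pos
        elif pos - prev <= max_gap:
            prev = pos  # extend the current cluster
        else:
            if prev - start >= min_len:
                regions.append((start, prev))
            start = prev = pos  # gap too large: begin a new cluster
    if start is not None and prev - start >= min_len:
        regions.append((start, prev))
    return regions
```

With probes every 50 bp, a 1.2-kb run of enriched probes is reported, whereas an isolated 400-bp run is discarded.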

DNase I hypersensitivity. Digital DNase I-Seq assays were performed as described 
previously**. The sequenced reads were aligned to the Berkeley Drosophila 
Genome Project release 5 (BDGP.R5) genome assembly, recording only uniquely 
mappable reads. To detect DNase I hypersensitive sites, hotspot positions were 
identified based on a 300-bp scanning window statistic (Poisson model relative to 
50 kb background density, Z-score threshold of 2), and peaks of read density were 
selected within the hotspots using randomization-based thresholding at 0.1% FDR. 
The set of high-magnitude DHSs analysed here (except for Supplementary Fig. 23) 
was identified as a subset of all peaks that show statistically significant enrichment 
over the normalized genomic DNA read density profile (using a 300-bp window
centred around the peak, binomial model, with Z-score threshold of 3). This 
method controls for copy number variation and sequencing/mapping biases; 
however, it may also reduce the sensitivity of DHS detection. In the DHS chro- 
matin profile clustering analysis (Fig. 5a, relevant Supplementary figures), DHSs 
found within 1 kb of another DHS were excluded if their enrichment magnitude 
(relative to genomic background) was lower (to avoid showing the same region 
more than once). 
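As a rough illustration of the hotspot scan, the sketch below scores a 300-bp window against the read density of its surrounding 50-kb background under a Poisson model. The exact statistic used in the study is not given in the text, so the normal approximation to the Poisson used here is an assumption, and `window_z_score` is a hypothetical name.

```python
import math


def window_z_score(window_reads, background_reads,
                   window_bp=300, background_bp=50_000):
    """Z-score of a window's read count relative to its local background.

    Under a Poisson model the expected count and the variance are both
    background_reads * window_bp / background_bp.
    """
    expected = background_reads * window_bp / background_bp
    if expected == 0:
        return 0.0  # no background reads: the window is uninformative
    return (window_reads - expected) / math.sqrt(expected)
```

A window holding 20 reads against a background of 1,000 reads per 50 kb (6 expected) scores well above the threshold of 2, so it would be retained as a candidate hotspot under this reading.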
RNA sequencing. The preparation of RNA-Seq libraries and sequencing is 
described in ref. 45. The sequenced reads were aligned to the BDGP.R5 genome 
assembly and annotated exon junctions, recording only uniquely mappable reads. 
The RPKM (reads per kilobase of exonic sequence per million reads mapped) was 
estimated for each exon. The total transcriptional output of each annotated gene 
was estimated based on the maximum of all exons within the gene. The presented 
analysis uses log10(RPKM + 1) values unless otherwise noted.
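The expression measure can be written out as a short sketch. It assumes per-exon read counts and lengths are already available; the function names are hypothetical.

```python
import math


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exonic sequence per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def gene_expression(exons, total_mapped_reads):
    """log10(RPKM + 1) of a gene, taking the maximum RPKM over its exons.

    exons: list of (read_count, exon_length_bp) tuples for one gene.
    """
    best = max(rpkm(count, length, total_mapped_reads) for count, length in exons)
    return math.log10(best + 1)
```

Adding 1 before taking the logarithm keeps unexpressed genes at zero rather than at negative infinity.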
GRO sequencing. Global Run-On library was prepared from S2 cells and 
sequenced as described?’ The reads were aligned to the BDGP.R5 genome assembly, 
recording only uniquely mappable reads. The smoothed profiles of reads mapping 
to each strand were calculated using Gaussian smoothing (σ = 100 bp). The analysis
uses log10(d + 1), where d is the smoothed density value.
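A minimal sketch of the strand-wise smoothing follows. It assumes per-base read-start counts for one strand, and truncates the Gaussian kernel at four standard deviations, which is a choice made here for illustration rather than a detail given in the text.

```python
import math


def gaussian_smooth(counts, sigma_bp=100, truncate=4.0):
    """Smooth per-base read counts with a normalized Gaussian kernel."""
    radius = int(truncate * sigma_bp)
    kernel = [math.exp(-0.5 * (k / sigma_bp) ** 2)
              for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(counts)
    smoothed = [0.0] * n
    for i, c in enumerate(counts):
        if c == 0:
            continue  # positions without reads contribute nothing
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                smoothed[j] += c * w
    return smoothed


def log_density(smoothed):
    """Apply the log10(d + 1) transform to smoothed densities."""
    return [math.log10(d + 1) for d in smoothed]
```

Each strand would be smoothed separately, so that sense and antisense signal can be compared at the same locus.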
Short RNA data processing. The short RNA data for S2 cells was generated by 
ref. 30, and was aligned and processed in the same way as the GRO-Seq data. 
Chromatin state models. To derive a 9-state joint chromatin state model for S2
and BG3 cells (Fig. 1a), the genome was first divided into 200-bp bins, and the 
average enrichment level was calculated within each bin based on unsmoothed 
log2 intensity ratio values taking into account individual replicates, using all
histone enrichment profiles and PC to discount the genome-wide difference in 
S2 H3K27me3 profiles. The bin-average values of each mark were shifted by the 
genome-wide mean, scaled by the genome-wide variance, and quantile-normalized 
between the two cells. The hidden Markov model (HMM) with multivariate normal 
emission distributions was then determined from the Baum-Welch algorithm 
using data from both cell types, and 30 seeding configurations determined with 
K-means clustering. States with minor intensity variations (Euclidean distance of
mean emission values <0.15) were merged. Larger models (up to 30 states) were 
examined, and the final number of states was chosen for optimal interpretability. 
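The state-merging step can be sketched as below. The text gives only the distance criterion, so grouping by single linkage (any chain of states closer than 0.15 is merged) is an assumption, and `merge_similar_states` is a hypothetical name. Fitting the HMM itself by Baum-Welch is not shown.

```python
import math


def merge_similar_states(means, max_dist=0.15):
    """Group states whose mean emission vectors are closer than max_dist.

    means: one tuple of per-mark mean emission values per HMM state.
    Returns a sorted list of groups, each a list of state indices.
    """
    n = len(means)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(means[i], means[j]) < max_dist:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

States that differ only slightly in intensity collapse into one group, leaving a smaller set of states that is easier to interpret.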
An extensive discrete chromatin state model (Supplementary Fig. 11) was 
calculated as described in ref. 20. The model was trained using a 200-bp grid 
with binary calls (enriched/not enriched). The binary calls were made based on a 
5% FDR threshold determined from ten genome-wide randomizations for each 
mark. For H1, H4 and H3K23ac regions of significant depletion rather than 
enrichment were called. 
Regions of enrichment for individual marks. To determine contiguous regions 
of enrichment for individual marks, a three-state HMM was used, with states 
corresponding to enriched, neutral and depleted profiles (normally distributed
emission parameters: μ = [−0.5, 0, 0.5], σ² = 0.3). The enriched regions were
determined from the Viterbi path. The HMM segmentation was applied to 
unsmoothed M value data taking into account individual biological replicates. 
The genes were clustered based on the combinatorial pattern of occurrence of 
enriched regions (coding exons and state panels were not used for clustering). 
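The segmentation can be illustrated with a log-space Viterbi decoder for a three-state HMM with Gaussian emissions. The means and variance are those stated above; the transition probabilities are not given in the text, so the self-transition probability `STAY` below is a hypothetical value chosen only for illustration, as is the uniform initial distribution. The input must contain at least one value.

```python
import math

MEANS = (-0.5, 0.0, 0.5)  # depleted, neutral, enriched (values from the text)
VARIANCE = 0.3            # shared emission variance (from the text)
STAY = 0.99               # hypothetical self-transition probability


def log_gauss(x, mu, var):
    """Log density of a normal distribution with mean mu and variance var."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def viterbi(values, stay=STAY):
    """Most likely state path (0=depleted, 1=neutral, 2=enriched)."""
    k = len(MEANS)
    log_stay = math.log(stay)
    log_move = math.log((1 - stay) / (k - 1))
    score = [math.log(1 / k) + log_gauss(values[0], m, VARIANCE) for m in MEANS]
    back = []
    for x in values[1:]:
        new_score, pointers = [], []
        for j in range(k):
            cands = [score[i] + (log_stay if i == j else log_move)
                     for i in range(k)]
            best = max(range(k), key=cands.__getitem__)
            pointers.append(best)
            new_score.append(cands[best] + log_gauss(x, MEANS[j], VARIANCE))
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(range(k), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Contiguous runs of state 2 in the returned path correspond to the enriched regions; a sticky transition matrix keeps single noisy probes from fragmenting a domain.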
Classification of enrichment profiles. Clustering of chromatin signatures around 
TSSs (Fig. 4a), PREs (Fig. 4b) and DHSs (Fig. 5a and relevant Supplementary 
Information sections) was determined using the Partitioning Around Medoids 
algorithm. For clustering, each profile was summarized with average values within 
bins spanning ±2-kb regions. One-hundred-base-pair bins were used for the
central ±500-bp region, 300-bp bins outside.
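The binning scheme can be made concrete with a short sketch. It assumes a per-base signal covering offsets −2,000 to +1,999 relative to the anchor (TSS, PRE or DHS); the function names are hypothetical, and the Partitioning Around Medoids step itself is not shown.

```python
def profile_bin_edges(span=2000, core=500, core_bin=100, outer_bin=300):
    """Bin edges (bp offsets) with fine bins in the core and coarse bins outside."""
    edges = list(range(-span, -core, outer_bin))       # upstream flank
    edges += list(range(-core, core, core_bin))        # central region
    edges += list(range(core, span + 1, outer_bin))    # downstream flank
    return edges


def summarize_profile(signal, edges, span=2000):
    """Average a per-base signal within each bin.

    signal: values for offsets -span .. span-1 around the anchor.
    """
    summary = []
    for lo, hi in zip(edges, edges[1:]):
        window = signal[lo + span: hi + span]
        summary.append(sum(window) / len(window))
    return summary
```

This yields 20 values per profile (5 flanking bins on each side and 10 central bins), which then serve as the feature vector for clustering.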


47. Clemens, J. C. et al. Use of double-stranded RNA interference in Drosophila cell 
lines to dissect signal transduction pathways. Proc. Natl Acad. Sci. USA 97,
6499-6503 (2000). 
it is complex and ever-changing.
For many in the developed world, eating has become
a leisure pursuit, and cooking a hobby. But our bodies are still
hard-wired for a tougher world where food means survival. 
Our sense of taste, for example, evolved to be a front-line 
defence against toxins and a sensor to help detect the most 
energy-rich fare. However, our innate craving for sweets 
and fats now seems to be leading us down a path of bodily 
destruction. 

Food affects people differently. Current nutritional research 
involves looking beyond ingredients in an attempt to under- 
stand the effects of food at genetic and epigenetic levels. From 
the first milk meal we take, through feast and famine; our 
genes influence our diet, and nutrients — or lack of them — 
affect gene expression. 

Regional differences in food and culture have left their mark 
on our genome. Around the world, populations have adapted 
to their diet to make the most of local resources. In some 
instances, a foodstuff can protect against deadly infection, 
giving selective advantage to those who can readily digest it. 

Nutrition has also directed the evolution of our species. 
Only Homo sapiens and our extinct hominin cousins have 
used fire to manipulate raw food, thereby creating safer, easily 
digestible and tastier recipes. Combined with the use of tools 
and an omnivorous, wide-ranging appetite, the advent of 
cooking increased the energy yield for metabolism and fed 
our enlarging brains. 

Because food is packed full of complex, biologically active 
molecules, the fact it has an impact on our health is no 
surprise. Yet teasing apart the effects of each component on 
the body is a tall task, and one that will continue for many 
years to come. Some people predict an age of diets custom- 
ized to individual energy needs and disease susceptibility. But 
no matter how good the science is, or how well we are able to 
exploit food as an agent of healthfulness, we will still be eating 
for pleasure for some time yet. 

We are pleased to acknowledge the financial support of 
Nestlé Research Center in producing this Outlook. As always, 
Nature retains sole responsibility for all editorial content. 
Michelle Grayson 
Associate Editor, Nature Outlook. 
Big science at the table 


Researchers are adopting the tools of bioinformatics and pharmaceuticals to study and 
interpret the ever-growing body of data on the interplay between diet and genes. 


BY LUCAS LAURSEN 


José Ordovas sips a mint tea in a languid 
café in Madrid, Spain. His eyes scan two 
mobile phones as he confirms his next 
appointments. In conversation, he switches 
effortlessly between Spanish and English to 
find the right expressions. If the geneticist 
seems to be moving on a different wavelength 
from the other patrons, he could blame it 
on the jet lag: he has just flown from Boston 
where it’s now 5am. This is his third overseas 
trip this month, but Ordovas contends his 
frequent visits from Tufts University, where 
he’s based, to Europe have no adverse effects. 
“For me the time difference doesn’t matter, I’m 
up at 4am to make calls to Europe when I’m 
home anyway, and then I’m up late on calls to 
California,” he says. 
Ordovas embodies the hustle and bustle of 
the ‘big science’ approach that has changed 
nutrition research in the past decade. This field, 
once confined to small groups of researchers 
studying the effects of single nutrients — such as 
particular vitamins or proteins — on a few 
dozen volunteers, is now adopting the heavy- 
lifting tools developed for genetics and phar- 
maceutical research. It also has a catchy name: 
nutrigenomics. And the more that researchers 
learn how our genes interact with our diet, 
the more they appreciate the deeper insight 
gained by an interdisciplinary approach. Such 
knowledge could lead to breakthroughs in our 
understanding of risk factors for diabetes and 
cardiovascular disease (see Edible advice, 
page S10) or, for example, improve the design 
of weight-loss diets. 

Nutrigenomics is starting to reveal that a 
person's diet is more than the number of calories 
they eat or the ratio of proteins to carbohydrates 
or fats. Those are important, but the analogy of 
human metabolism as a car engine that requires 
a certain type and amount of fuel does not hold 
up in the age of whole-genome analysis. Nutri- 
tion researchers are realizing that our diet does 
more than just fire our pistons. It is as if the fuel 
we consume can reach out from the combustion 
chambers in the engine — through the genetic 
pathways that govern our metabolism — and 
tune the engine mid-race. 

Multiply those fine adjustments by every 
possible mutation in each gene of the human 
genome, perhaps 10 million tweaks in total, and 
you have an idea of the scale of Ordovas’ task. 
“The only way to realize this concept is via big 
science,” he says. 

Ordovas studies how food influences 
cholesterol and other cardiovascular health 
indicators in large groups of people. “You take 
large numbers of individuals with a 
well-characterized diet,” says his collaborator 
John Mathers, a nutrition scientist at the University 
of Newcastle, UK, “and you do a genomic 
study to ask the question: how does diet 
interact with the genome to produce a particular 
phenotype?” Cardiovascular health 
might be nutrigenomics’ strongest application 
to date, Mathers says, but researchers are 
also beginning to study the long-term effects of 
nutrition on the brain and on ageing. 

ILLUSTRATIONS BY DAVID PARKINS 


ADOPTING AND ADAPTING 

Answering these questions requires examin- 
ing how small genetic mutations, called single 
nucleotide polymorphisms (SNPs), affect the 
production of enzymes and hormones that 
control metabolism. There are thousands of 
these mutations in each individual and count- 
less feedback loops, meaning that researchers 
in the emerging area of metabolomics must 
employ sophisticated bioinformatics models. “A 
lot of those tools were developed for pharma- 
ceutical studies and now have become almost 
commonplace in all areas of biology, including 
nutrition,” says Mathers. 

Progress in pharmaceutical research has 
stimulated improvements in microarrays, 
high-throughput sequencing, polymorphism 
identification and DNA methylation tech- 
nologies, used to scan for novel receptors 
that might respond to potential drug mol- 
ecules, says bioinformaticist Chris Evelo of 
Maastricht University in the Netherlands. In 
large clinical trials, researchers often col- 
lect information about multiple levels of 
an individual's health before and after the 
trial in case a drug targeting the heart has an 
unanticipated effect on the liver, for instance. 
Likewise, nutrigenomics specialists are 
concerned with the broader effects of any 
experimental dietary intervention. “This 
system-wide approach has been the rule 
in nutrigenomics research all along. There 
often are no clear target genes for diet changes,” 
says Evelo. This problem forces researchers 
to seek out subtle interactions among 
many elements of the metabolic system and 
related genes. 

In addition to epidemiological studies, which 
examine global populations without interfering 
with anybody’s diet, many researchers in 
nutrigenomics are employing intervention 
studies, which are more like the clinical trials 
used by drug and medical device makers. 
“In this other type of study you deliberately 
modify the nutritional exposure 
of cells, animals or people,” explains Mathers, 
“and then measure the expression of 
genes using whole genome expression arrays 
to try to understand how altered nutritional 
exposure regulates gene expression and, 
ultimately, phenotype.” 

Nutrition researcher Lynnette Ferguson at 
the University of Auckland in New Zealand has 
experienced the move towards more pharma- 
like genome-wide intervention studies. She 
notes that, as recently as 2003, she and her 
colleagues were “talking about single genes, 
single nutrients.” Yet many promising treat- 
ments based on single molecules had clear 
effects in the lab but never passed animal 
trials. This is because, as Evelo says, “if you 
push the system in one place it will compensate 
through another mechanism and in the 
end the wished-for effect does not occur.” Since 
then, rapid improvements in microarrays and 
‘deep sequencing’ technologies have enabled 
researchers to consider the impact of 
food down to the level of individual SNPs. They 
have also given them a more objective tool to 
measure what volunteers are actually eating, 
rather than relying on self-reporting. 


MEETING OF MINDS 

New conferences catering for nutrigenomics 

4th Asia Pacific Nutrigenomics Conference 
21–25 February 2010, Auckland, New Zealand 
Exploring the theme of gut health as 
influenced by both genetics and the 
microbiota. Around 200 people attended 
from 19 countries. 

7th NuGO Week 
31 August–3 September 2010, Glasgow, UK 
An overarching theme of metabolic health, 
with sessions on biomarkers, modelling 
tools and personalized nutrition. Around 
130 people attended. 

1st International Conference on Nutrigenomics 
26–29 September 2010, São Paulo, Brazil 
Discussions centred on the interaction 
between diet and genes, and how this 
enables personalized health and disease 
prevention, particularly in Latin America. 

1st Global HealthShare Initiative Workshop 
18–20 October 2010, Davis, California 
An invitation-only event that jointly 
addressed issues of nutrition and immunity 
in the developing world. 

4th Congress of the International Society 
of Nutrigenetics/Nutrigenomics 
17–20 November 2010, Pamplona, Spain 
Reviewing developments in the related 
fields of nutrigenomics, nutrigenetics and 
nutriepigenomics, in disease prevention. 

Adopting technology from outside tra- 
ditional nutrition science means adopting 
new research methods. “My own advantage 
was that I had been part of a cancer research 
programme,” says Ferguson. “I’ve watched 
the development of pharmaceuticals, seen 
my colleagues work with them and seen the 
sorts of models they use.” Ferguson’s team used 
high-throughput sequencing to screen 
human cells for modifications to the inter- 
leukin-12/23 receptor pathway — important 
for bowel health — that they suspected were 
caused by certain foods. This work helped 
them develop a cellular assay for measuring 
the effect of particular food components on 
gene expression in human cells. The next step 
is to validate whether such nutrient-genome 
interactions exist in animal models, before 
planning human trials, just as if they were test- 
ing a new drug. 


GENETIC PROFILING 

These tests will not be straightforward: not all 
people respond to dietary changes in the same 
way, just as not all people react alike to a particular 
medicine. Identifying different populations 
based on their genetic responsiveness is starting 
to show promise, according to Ordovas. In the 
best case scenario, researchers would screen 
individuals against panels of genetic risk factors. 
In the case of cholesterol, Ordovas and col- 
leagues have found specific genetic differences 
between people whose cholesterol levels are 
affected by changing their diet and those who 
only respond to medication. Right now, doctors 
try patients on multiple diets before prescrib- 
ing cholesterol-reducing drugs to avoid side 
effects. But with a reliable genetic screening 
test, doctors could prescribe drugs to patients 
unlikely to respond to dietary changes, saving 
time and helping reduce the harm caused by 
living with elevated cholesterol levels. 

The majority of dietary effects are subtle, 
however, and certain genetic profiles might 
be relatively rare and more difficult to screen. 
This requires large cohorts to detect and iden- 
tify signals. “In any gene, there are a few key 
polymorphisms that we scan, but others will 
be less common, may not be on the chip we 
use, or in the specific ethnic group that we are 
studying, but could still cause disease,” says 
Ordovas. That means he may need to scan 
ever-larger numbers of volunteers — perhaps 
into the hundreds of thousands. Unlocking 
the massive datasets that will emerge will, of 
course, require dozens of researchers — out- 
numbering the volunteers that participated in 
Ordovas’ studies in the 1980s (a fact he men- 
tions when he gives presentations about this 
burgeoning field). 
On top of the new mentality and tools, 
any new scientific discipline needs a way to 
share data. Through a collaboration called 
the European Nutrigenomics Organisation 
(NuGO), Ben van Ommen at the Netherlands 
Organisation for Applied Scientific Research 
(TNO) recruits contributors to the Nutritional 
Phenotype Database (dbNP). Its goal is to 
combine data from many different areas of biol- 
ogy, including genetics, transcription, pro- 
tein production, metabolism and behavioural 
data. “The European Bioinformatics Institute 
made Array Express and the US National 
Center for Biotechnology Information has 
made Gene Expression Omnibus and they 
store transcriptome data,” says van Ommen. 
“That’s good but it’s not good enough for us. 
Nobody does just a transcriptome study or just 
a metabolomics experiment — everybody does 
it all together.” 

Ordovas agrees: “When I began studying 
lipids I only looked at the biochemistry. We 
all used to be like rhinoceros poachers who 
took the horn and left the carcass, but now we 
have more tools and collaborators and every- 
one extracts information from all the data in 
a study.” 


MAKING TEAMWORK PAY OFF 

This type of ‘extensive phenotyping’, quantifying 
all relevant parameters, is already paying 
off. A NuGO study led by Gertruud Bakker 
found that an experimental anti-inflamma- 
tory diet in 36 healthy but overweight men 
increased the concentration of adiponectin, 
an anti-inflammatory protein, in the bloodstream. 
By monitoring hundreds of other 
metabolism-related proteins and metabolites of 
blood cells and adipose tissue, the team identified 
more than 500 other diet-driven 
changes. These included improving the ratio 
of omega-3 to omega-6 anti-inflammatory 
precursors in blood plasma and lowering levels 
of oxidative stress-causing prostaglandin 
in urine. If the team had used only 
single-metabolite methods, van Ommen says, 
they “would only have detected an effect on 
adiponectin.” 

Adapting pharmaceutical technologies 
to food isn’t the only challenge for researchers: 
“It’s also how you deal with all that data,” says 
Evelo. Some computer models aim to describe 
observations whereas others try to replicate 
or predict. “We are trying to integrate 
those two approaches,” says Evelo. This could 
help researchers working on different facets 
of the same problem to better understand 
one another's results, forge new collaborations, 
and help trace biological problems from the 
point where food molecules interact with the 
transcriptome to the symptoms that are pre- 
sented in a doctor’s examination room. 

As a proof of principle, the NuGO team 
used dbNP to track the development of human- 
like insulin resistance. Evelo and colleagues 
fed mice a high-fat diet and performed genome- 
wide transcriptome analysis, tissue sampling, 
plasma sampling and proteome analysis. 
They observed that the first signs emerged in 
a type of fat tissue. This finding neatly explains 
previous studies that suggested the ratio of 
saturated to unsaturated fatty acids 
affects whether a person develops insulin 
resistance. 

In addition to the database, there 
are a slew of new meetings (see 
Meeting of minds, page S4). Ferguson 
established an annual retreat 
to help New Zealand’s nutrition 
and genomics researchers, from 
academia and industry, find common 
ground. “I feel that the slight tension 
between different priorities [in these 
groups] has actually been a benefit,” says 
Ferguson. One resulting food developed from 
genetic research on Crohn's disease is a bread 
less likely to inflame an irritable bowel. 

Nutrigenomics researchers also make the 
most of social networking to stay in touch. 
One researcher uses the Twitter handle 
@nutrigenomics; Ordovas and Jim Kaput, head 
of the FDA’s personalized medicine division, 
often make Skype calls during the weekend. 
If this side of big science sounds a bit like cul- 
tivating a long-distance relationship — it is, 
says Ferguson. Selecting collaborators at first 
was “like early dating situations: did we want 
to work together? Did we want to work with 
other partners?” Now that funding is available, 
there are many more people expressing 
an interest. Ferguson and collaborators 
must now ask the hard questions of ‘what’s 
your skill set?’ and ‘what can you contribute?’ 
before inviting would-be partners on board. 

S4 | NATURE | VOL 468 | 23/30 DECEMBER 2010 

The near-term future of nutrigenomics is 
almost certain: researchers are already hus- 
tling to persuade government and funding 
bodies to finance follow-up studies on the lat- 
est research by asking the same questions but 
on a more ambitious scale — testing hypotheses 
derived from cell cultures in animals and 
humans. 


LONGER TO WAIT 

Some researchers question how useful indi- 
vidual nutrition advice will be in the near 
term. “Personalized nutrition advice may 
not be helpful to the general public if they 
don’t know their own genetics,” says Albert 
Koulman, an analytical chemist at the Medi- 
cal Research Council in Cambridge, UK. But 
consumer genomic analysis provokes more 
questions, such as who pays, who gets the 
results and whether it affects health insurance 
rates. “There's much more than just the biol- 
ogy, there’s the business side and the ethics. 
We're still just scouting scenarios,” says van 
Ommen. 

Commercial pet food today may be a pre- 
view of the kind of food categories humans 
might find in future markets, according to 
Kenneth Kornman, head of InterLeukin 
Genetics. “Pet foods I get for my dog are age- 
categorized, or categorized by sensitivities such 
as gastrointestinal problems,” he says. Dietary 
needs for individuals also change over the 
course of their lifetime and from one group of 
people to the next. 

Food manufacturers could one day offer the 
same choices pet food makers do today — with 
the additional cost of ensuring that the food 
is safe for human consumption. There is a big 
cost to launching such a food, notes Kornman. 
“You’d need to have a reasonable idea that you'll 
earn it back.” Yet few companies know how to 
market genetically customized nutrition to 
customers or how to successfully patent a diet 
consisting of widely available foods, he says. 

Instead, nutrigenomics researchers face 
the challenge of identifying and measuring a 
much more subtle state than disease: health. 
“Optimal health is much more than the absence 
of disease,” says van Ommen, “so we need a different 
set of biomarkers, not of disease, but of 
health.” 

Measuring that will require understanding 
more than just the chemistry of our food or the 
on-off switches of our genes. “We've started to 
better appreciate the fact that it’s not just the diet 
and it’s not just the genetic factors but it is an 
interaction of the two that permits a metabolic 
change that gets translated in a complex disease 
over time,” says Kornman. It may be a tricky 
tune to follow, but nutrigenomics researchers 
are all ears. 


Lucas Laursen is a journalist based in Madrid. 


I. HOOTON/SCIENCE PHOTO LIBRARY 


| DEVELOPMENT | 


Mother’s milk: A rich opportunity 


Research on the contents of milk and how breast-feeding 
benefits a growing child is surprising scientists. 


BY ANNA PETHERICK 


Should you ever need to mix formula milk 
for a tammar wallaby, you will face a complicated 
recipe. During its 300 days in the 
pouch between birth and weaning, the baby 
wallaby, or joey, drinks different milk almost 
on a weekly basis. 

Early on, the joey needs colostrum, which is 
packed with antibodies. After 60 days, the formula 
should be rich in asparagine-containing 
peptides, which are thought to help brain development. 
Ninety days later, the baby wallaby will 
need a dose of sulphur-containing amino acids, 
such as cysteine and methionine, which will 
cause hair follicles and nails to grow. 

For healthy development, the number of 
calories contained in the milk must also rise, 
such that the joey is weaned from milk that 
is four and a half times as energy rich as 
the liquid it first drank. This compositional 
sequence appears to be entirely dictated by the 
mother’s body. And, bizarrely, her teats can 
function independently, with each baby wallaby 
effectively stuck to one teat for the first 200 days 
in the pouch. In fact, a joey may have a pouch 
mate of a different age that feeds on an adjacent 
teat — and which receives milk of a different 
composition, appropriate for its age¹. 


Human milk recipe for newborns 

Colostrum 

An immunological delight! 

✓ High in short-chain HMOs* 
✓ High in immunomodulatory IL-10 
✓ Low in fat 
✓ Low in caseins 

Key ingredients 
Whey protein, immunoglobulins 
(particularly IgA), lactoferrin, 
vitamins A and E, carotenoids and 
cytokines (especially IL-1, IL-6 
and TNF-α) 

*Human milk oligosaccharides 


Unlike a wallaby’s, human milk does not 
change so radically over time because the devel- 
opmental signals, which wallabies transfer in 
milk, can be delivered through the human pla- 
centa. The major constituents of human milk 
— the fat, protein and carbohydrate — vary 
little over the course of lactation. But a closer 
analysis reveals important time-dependent 
variation in the complement of bioactive 
ingredients in human milk — the molecules 
and cells that have biological functions beyond 
fuelling metabolism and providing the raw 
materials for infant growth. Finding what these 
ingredients are and what they do drives much 
of today’s lactation research. 


A MAMMALIAN MIXTURE 

Until recently, the study of human lactation 
was conceived mainly from the perspective of 
public health. Now the trend is to approach the 
subject from an evolutionary standpoint. This 
perspective presumes that an infant should 
breastfeed as much as possible to maximize its 
chances of survival, whereas a mother must 
balance her current metabolic investment in 
milk production with her potential investment 
in future offspring. 

For example, evolutionary theory suggests 
that mothers should invest more in feeding 
sons because a successful son can produce 
many more offspring than a daughter. Sev- 
eral recent studies support this view by iden- 
tifying clear differences in the breast milk 
consumed by males and females. In humans, 
for example, baby boys receive milk that has 
substantially more fat and protein than the 
milk girls get². In rhesus macaques, 
sons drink milk with a higher concentration 
of cortisol, a hormone that modulates 
metabolism, even though their mothers have 
no more cortisol circulating in their blood than 
when nursing a daughter. It is unclear whether 
this cortisol-related sex difference has a func- 
tion. But there are clues: young male macaques 
that consume milk containing high levels of 
the hormone develop bold behaviour, whereas 
cortisol in milk appears to have no influence 
on female macaque infants³. Whether this has 
a parallel in humans is yet to be determined. 

A second major shift in human lactation 
research entails the incorporation of new tools 
to answer traditional questions — such as com- 
paring the effects of breast and formula feeding 
— and to grapple with evolutionary and func- 
tional issues. Human milk is dilute compared 
to the milk of other placental mammals, but 
it does contain some surprising ingredients. 
Advances in high-throughput mass spectrom- 
etry, for example, have revealed the existence 
of more than 200 human milk oligosaccharides 
(HMOs). Carlito Lebrilla, an analytical chemist 
at the University of California, Davis, has found 
that mothers seem to produce individual com- 
plements of about 100 HMOs — but no one 
has figured out why different mothers produce 
different sets of HMOs, or even if it is the same 
complement of HMOs for each child. 

Although they are carbohydrates, HMOs do 
not appear to nourish infants. Instead they feed 
certain gut bacteria, giving them a competitive 
edge over other species. “When a child is born 
its gut is rapidly populated by pathogenic bac- 
teria,” says Lebrilla. “However as the child is fed 
human milk the population changes to beneficial 
species.” Bifidobacterium infantis, which protects 
against diarrhoea, is particularly efficient 
at metabolizing the small-mass HMOs that 
are abundant in early lactation⁴. So breast milk 
gives B. infantis an advantage over other species 
in establishing a gut population. “The mother is 
therefore ‘selecting’ specific bacteria to grow in 
the infant’s gut by her HMOs,” says Lebrilla. 


Early milk 

Especially tasty for friendly bacteria! 

✓ High in lactose 
✓ High in casein 
✓ High in fat 

Method 
Start with basic colostrum recipe. 
Decrease content of short-chain 
HMOs*, vitamins, carotenoids, 
whey protein, TNF-α and IL-10 

*Human milk oligosaccharides 


Furthermore, some HMOs can inhibit harm- 
ful bacteria and viruses directly. For example, 
certain HMOs block the binding of Campylobacter 
jejuni, the most common cause of bacterial 
diarrhoea, to intestinal mucosa, thereby 
inhibiting pathogenesis⁵. 


Human milk also delivers some microbes 
directly to the gut. Breast milk is laced with 
several species of lactic acid bacteria from the 
mother’s intestine that are thought to travel 
to her mammary glands inside white blood 
cells. Most of these species inhibit pathogenic 
bacteria by secreting hydrogen peroxide and 
compounds called bacteriocins. 

The past decade has seen a large extension 
in the list of immunological factors detected 
in human milk. Breast milk was long thought 
to provide only passive immunity to infants, 
through maternal antibodies in the form of 
secretory immunoglobulin A. However, the 
newly identified crop of immune-regulatory 
proteins could be prompting and modulating 
development of the infant’s own immune system. 
Of particular interest are cytokines, which 
orchestrate the immune system by signalling 
between its cells. 

There is even evidence that breast milk 
influences gene expression in infant gut cells. 
In a pilot study, Sharon Donovan, a paediatric 
nutritionist at the University of Illinois, 
and Robert Chapkin, a biochemist at Texas 
A&M University, extracted RNA from exfoliated 
intestinal cells from several 3-month-old 
infants. They assessed the statistical difference 
in RNA expression between breast- and 
formula-fed infants. Several of the genes that 
varied were identified as putative master 
genes, which control the expression of other 
genes. Most of these genes encode transcription 
factors associated with angiogenesis and 
wound repair — including EPAS1, a gene that 
is transcribed three times as much in the gut 
cells of breastfed infants⁶. 


BRAINY BABIES 

Does breast milk make you smarter? 

Between late 2002 and the spring of 2005, 
13,889 Belarusian children of about six years 
of age took an IQ test and had their reading 
and writing skills evaluated by teachers. The 
mothers of about half of them had been 
encouraged to breastfeed under a World 
Health Organization (WHO) programme 
called the Baby-Friendly Hospital Initiative. 
As a result, these mothers were seven times 
more likely to have exclusively breastfed until 
their child was 3 months old. 

Results of this study, called Promotion of 
Breastfeeding Intervention Trial (PROBIT), 
showed that the 6-year-olds whose mothers 
were part of the WHO initiative had better 
academic ratings from their teachers and 
IQ scores on average 5.9 points higher¹⁰. 
“PROBIT found lots of health benefits in 
the first year of life,” says Michael Kramer, 
an epidemiologist at McGill University in 
Montreal, Canada, “but over the longer term 
the only difference was cognitive ability.” 

No one is quite sure what causes this 
intelligence boost. But one 2007 study by 
Duke University psychologist Avshalom 
Caspi has identified a candidate: a gene 
that appears to mediate the effects of 
human milk on brain development¹¹. 
Caspi and colleagues trawled the Kyoto 
Encyclopaedia of Genes and Genomes 
(KEGG) database for genes involved in the 
metabolism of long-chain polyunsaturated 
fatty acids. These acids are linked to several 
aspects of neuron development. Two such 
fats — docosahexaenoic acid (DHA) and 
arachidonic acid (AA) — are present in 
human breast milk, but not in cows’ milk or 
most infant formulas. 

The KEGG search identified a gene on 
chromosome 11, called FADS2, which is 
both regulated by dietary AA and DHA and 
also encodes an enzyme that catalyses 
metabolism of these two acids. One specific 
variant of the FADS2 gene was present in 
more than 90% of the cohort in the study. 
Researchers found that only the breastfed 
babies who had this specific FADS2 variant 
exhibited an IQ advantage. The research 
implies that fatty acid metabolism could 
be part of the missing link between 
breastfeeding and IQ. This FADS2 variant 
was estimated to account for a difference of 
4.1 IQ points, which goes a long way towards 
explaining the 5.9 IQ points difference found 
in the PROBIT trial. 

Tests point to higher IQ in breast-fed children. 
T. HALL/PHOTOLIBRARY.COM 


Donovan and Chapkin’s study is the first evi- 
dence that breast milk — rich in natural bacteria 
— affects infant gene expression, and Donovan 
cautions about over-interpreting their findings. 
This is, however, likely to be an expanding area 
of research as probiotics become more com- 
monly used as ingredients in formula milk. “We 
have no idea how these are potentially affecting 
gene expression,” says Donovan. 

Over the years, the ‘breast versus formula’ 
debate has become polarized, and several 
researchers contacted for this article com- 
plained that either breastfeeding advocacy 
groups or formula companies had exaggerated 
their findings in the past. Donovan's recent gene 
expression study was sponsored by a formula 
milk manufacturer, but she is applying for US 
National Institutes of Health funding for fur- 
ther studies to avoid the criticism that comes 
with being commercially funded. 


HEALTH CONTROVERSIES 
Researchers have tried to disentangle the 
effects of feeding an infant formula rather 
than breast milk. “The vast majority of studies 
tend to gravitate towards breast milk as bet- 
ter, rather than equal, but the evidence varies 
in quality,” says Jonathan Wells, who studies 
human ecology at University College London. 
“Many of the accepted benefits of human milk 
relate to avoiding pathogens.” And while these 
pathogens might be less dangerous to a baby 
in a more medically advanced society than 
in a developing one, breast milk still offers 
advantages to all infants. Breastfeeding has 
consistently been found 
to protect against necro- 


WHAT'S IN HUMAN MILK 


Human milk oligosaccharides (HMOs) are food for friendly bacteria like Bifidobacterium 
infantis. Shorter chain HMOs in particular are almost entirely consumed by this microbe. 


tizing enterocolitis (in which portions of the bowel tissue die) in pre-term
infants, and against diarrhoea and ear infections in full-term infants.

Impacts on health in later life stand out less clearly in the data, although
associations between formula feeding and type 2 diabetes and inflammatory bowel
disease have been observed. Some meta-analyses report that breastfeeding reduces
the chance that a child will be obese at school age by about 20%. But these
results are not conclusive. The largest breastfeeding trial, Promotion of
Breastfeeding Intervention Trial (PROBIT), found no difference in the plumpness
of two groups of six-and-a-half-year-old Belarusian children, where one group
had been breastfed for much longer before the introduction of formula milk.
There was also no difference between these groups in the prevalence of asthma
or allergies. PROBIT did, however, show an intriguing link between breastfeeding
and intelligence (see Brainy babies, page S6).

The breastfeeding-IQ association had been reported before, but what made
PROBIT’s results important was the size of its dataset. It is critical to have
a very large sample size in order to eliminate confounding factors. Qualities
such as obesity and IQ often vary across rich countries in similar patterns to
the tendency of mothers to breastfeed. In developed countries, wealthier women
are more likely to breastfeed, but they are also generally slimmer, better
educated and spend more time talking to their babies.

It might be that certain ingredients in formula milk are responsible for later
weight issues. Results from the European Childhood Obesity Project support the
‘early protein hypothesis’, which holds that higher levels of protein found in
standard infant formulas programme the body to become fatter in later years.
The project randomized 1,000 European infants to receive either formula of
high-protein concentration (standard formula), formula of low-protein
concentration (similar to the protein content of human milk), or breast milk.
The result: unlike the high-protein group, the low-protein group grew no
tubbier than the breastfed control group.

[Figure: composition of milk. Labels: HMOs, proteins, macro- and
micronutrients, lipids, lactose.]

[Figure: Human milk recipe for newborns. Mature milk: nourishment for growth.
More milk, less fat, less whey protein. Method: HMO* content should be at half
initial level; IL-1β, IL-6, vitamins and carotenoids also substantially
reduced; maintain concentration of lactose, IFN-γ and casein.
*Human milk oligosaccharides.]
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[Figure: HMO chain length and the proportion eaten by B. infantis; other HMOs are of longer lengths.]


The diverse ingredients of an infant's first 
meal have an impact on its development, 
and no matter how much we tinker with the 
composition of formula milk it will always 
lack many of the trace constituents of human 
milk. As research identifies these substances, 
it increasingly seems they serve a role beyond 
direct nutritional benefit: that of communicat- 
ing information to the infant about the environ- 
ment and even the social structure around the 
mother, which affects the richness of her diet 
and her level of physical activity and therefore 
also affects her milk. 

Wells believes that very young humans 
should be thought of as having to adapt to the 
mother’s surroundings, rather than to the wider 
world. Indeed, the fact there are so many bioac- 
tive molecules in breast milk means that breast- 
feeding is an activity that empowers mothers. 
He adds, “The more we learn about the details 
of breast milk the more we realize that males 
have a little chance to influence their offspring
by non-genetic pathways. Mothers have a very
rich opportunity.”


Anna Petherick is a journalist in Buenos Aires. 
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| EVOLUTION | 


The first supper 


Diet-directed evolution shaped our brains, but whether it was meat or tubers, or their 
preparation, that spurred our divergence from other primates remains a matter of hot debate. 


BY MICHAEL EISENSTEIN 


Given the millions of years since our
lineages parted ways, it’s unsurpris-
ing that a chimpanzee’s idea of a good
meal differs from our own. “When I visited 
our study site in Uganda, I followed a chimp 
in the forest for a day and tried to eat every-
thing it ate,” recalls Svante Pääbo, an evolu-
tionary geneticist at the Max Planck Institute
in Leipzig, Germany. “It’s too disgusting and
not digestible; you can't really do it.”

Part of the reason is genetics. In 2008, Pääbo
and colleagues found evidence for accelerated
evolution of both the regulatory and coding
sequences of diet-related genes shared by chim-
panzees and humans. Many anthropologists
now believe that radical changes in diet may 
have been a major driver of hominin evolution 
and possibly even the primary factor that pro- 
pelled our genus Homo forward by enabling us 
to survive and thrive. 

One evolutionary milestone was encephali- 
zation: an enlargement of the brain estimated 
to have begun roughly 1.8 million years ago 
when Homo habilis transitioned to Homo 
erectus. What powered this growth spurt 
remains a subject of ongoing debate. 


MEAT AND POTATOES 

A big brain is a huge investment in metabolic 
terms. One model advanced in the mid-1990s, 
the expensive tissue hypothesis, suggests our 


ancestors settled that bill by gaining access to 
more nutrient-rich diets, which spurred brain 
growth while reducing gut size. Scientists 
have suggested that the wealth of vitamins, 
proteins and fats in meat was a major boon 
and there is evidence our ancestors used 
stone tools to carve up their food as early as 
2.5 million years ago. An article published in 
Nature this year reported the find of
3.4-million-year-old fossil bones scarred by cutting
tools, pushing the date back further still to 
australopithecines. 

“There's fairly decent evidence that meat was 
likely a piece of the diet of australopithecines,” 
says Josh Snodgrass, an anthropologist at the 
University of Oregon, “but they were prob- 
ably eating diets that were much more plant- 
based.” Given the richness of nutrients in meat, 
Snodgrass believes that even minor changes 
would have had a big impact on caloric intake 
and contends that use of more sophisticated 
tools may have increased consumption of meat 
in early hominins. “Access to high-quality ani- 
mal foods was probably at least one of the major 
driving factors in allowing [encephalization] to 
happen,” he says. 

On the other hand, the pursuit of a steak 
dinner is not without hazards, according to 
David Braun, an archaeologist at the Univer- 
sity of Cape Town in South Africa. “There are 
multiple consequences of making that shift,” he 
says. “There are costs of predator-prey inter- 
action, of entering into a niche that hominins 
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aren't necessarily all that well-adapted to, and 
all kinds of parasitological costs.” 

Dartmouth College anthropologist 
Nathaniel Dominy favours the view that our 
ancestors might have put their tools to bet- 
ter use in unearthing root vegetables. He has 
observed how modern hunter-gatherers sur- 
vive in an African savannah-like environment 
that may not be radically dissimilar from where 
H. erectus flourished. He suggests that tubers 
offered an essential buffer against the vicissi- 
tudes of the hunter lifestyle. “Modern hunter- 
gatherers have language, technology and 
iron-tipped spears, yet they still struggle to get 
enough meat to survive,” he says. “It’s hard to
imagine a bunch of hominins without those 
accoutrements getting a lot of meat.” Tubers 
were abundant and may have provided the 
staple nutrients needed to make brain growth 
adaptive when easy access to meat was no sure 
thing. 

However, efficient tuber digestion depends 
on another major technological advance — 
cooking. “Most tubers absolutely require 
roasting,” says Dominy. Harvard University
anthropologist Richard Wrangham believes 
this is not a problem. In 1999, he published a 

controversial article pro- 
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HOMININ COOKBOOK 


Evolution of our diets and food preparation techniques. 
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Consumption of aquatic animals, 
mainly catfish (South Africa). 


million years ago. Wrangham has since devel- 
oped this concept to explain how our ancestors 
maximized the nutritional benefits of tubers, 
meat and other foodstuffs. “It has not been 
appreciated by most people until recently that 
cooking has a large effect on net energy gain,” 
says Wrangham. “Normally it’s considered nec- 
essary because it enlarges the possible diet and 
makes food safer, but energy is such a key vari-
able for evolutionary adaptation.”

Preliminary analyses by Wrangham and 
colleagues suggest that cooking may have 
made proteins and starches more digestible 
while simultaneously reducing the cost to the 
immune system of fending off parasites or bac- 
terial infection. 


THE HARD FACTS 

Many anthropologists remain wary of the evi- 
dence gap in Wrangham’s hypothesis. The ear- 
liest sign of controlled fire comes from Israel, 
dating back some 800,000 years — consider- 
ably shorter than 2 million years. Neverthe- 
less, Braun is hesitant to rule out Wrangham’s 
theory, pointing out that remains of cooking 
fires can be ephemeral. The evidence found at 
the Israeli site is particularly unusual. “Gesher 
Benot Yaaqov is the kind of place archaeolo- 
gists dream of,” he says. “Wood is preserved 
there, as are all kinds of activities that aren't 
preserved elsewhere.” 

Braun has encountered similar challenges: a 
recent study by his team at a 1.95-million-year-old
site in Turkana, Kenya, found remains of
bones and stone tools indicating that prede-
cessors of H. erectus may have routinely eaten
fish and other marine life. If this represents
a true dietary pattern, then ‘brain food’ may 
have lived up to its name by providing an abun- 
dant source of the polyunsaturated fatty acids 
that fuel the growth of the cerebral cortex. 


Nevertheless, an early role for aquatic animals 
in the hominin diet remains controversial as 
archaeological evidence points to seafood only 
becoming a regular item on the menu between 
150,000 and 200,000 years ago. This could be 
explained by the challenges of actually finding 
evidence of these foods being prepared. “The 
preservation that happened at that particular 
site, I think, is unusually good,” says Braun. 
“We usually use marks on bone surfaces as a 
determining factor of whether something is 
part of the diet [and] those don't preserve really 
well for aquatic animals.” 

Unfortunately, any efforts to link food choice 
to human evolution will continue to depend on 
what can be unearthed at such sites: evidence 
from the genetic record is likely to be harder to 
find. Pääbo and colleagues assembled a draft of
the Neanderthal genome. This offers a wealth
of information on human evolution over the 
past 50,000 years. 
However, there is an 
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which is probably 
ideal, [the limit is] 
somewhere on this side of a million years — 
and it’s much more realistic to say half a mil-
lion years, maximum,” says Pääbo. As such, any
hope of obtaining usable genomic data from 
our early African ancestors is a pipe dream, 
and attempts to characterize hominin genetic 
evolution generally focus on our closest extant 
kin — the chimpanzee and bonobo. 

Some of the best evidence might be found 
lining the fossilized jawbones of our ances- 
tors. Peter Ungar, a paleoanthropologist at the 
University of Arkansas, has been using digital 
analysis to chart the ‘landscapes’ of ancient 
teeth down to the subtle abrasions that cover
the chewing surfaces. “Those scratches are the
actual result of a hominin passing food across
its teeth, and we can relate that to what the
animal was adapted to doing,” he says.

[Figure, Hominin cookbook: Evidence of starch consumption, including granules of sorghum and African potato (Mozambique).]

Based on a growing collection of both 
H. habilis and H. erectus samples, Ungar 
sees a striking transition to teeth that are 
thinly enamelled and highly textured, which 
are clues to a diversification in diet. “If our 
Homo ancestors were processing their food 
outside of the mouth more with tools, then 
you're not going to get the same selective 
pressures to maintain big, thickly enam- 
elled, flat teeth,” he says. “Teeth with thinner 
enamel and more relief are actually better for 
shearing and grinding tougher foods, like 
meat and leaves.” He suggests that although 
individual H. erectus may not have necessar- 
ily indulged in a diverse diet, they developed 
a capacity to rely on a broad array of ‘fallback 
foods’ — a skill that would have proved use- 
ful in the rapidly changing climate of the early 
Palaeolithic, and enabled humanity to settle far 
beyond the continent of Africa. 

Braun considers this a reasonable theory, but 
he also appreciates the need for further inves- 
tigation into the nutritional building blocks of 
this increasingly diverse diet. “For every 10 years 
of field work, we answer one or two questions,” 
he says. “It’s going to require a lot more boots 
on the ground.” In the meantime, anthropolo-
gists and archaeologists will have to continue
to content themselves with reconstructing the
Palaeolithic buffet one course at a time.


Michael Eisenstein is a journalist in 
Philadelphia 
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Edible advice 


Diet-related illnesses are some of the biggest killers today. 
Can we tailor our food intake to prevent these diseases? 
Large international projects are underway to find out. 


BY FAROOQ AHMED 


“Too much tea can treble cancer risk in
women.” “Tea could cut risk of ovarian
cancer.” Just two examples of the fre-
quent contradictory newspaper headlines that
confuse the public about the health benefits,
or risks, of food and confound genuine
nutrition-related research.

For some diseases such as diabetes the link 
with food is subtle. “Although we know that 
dietary factors are related to the risk of diabetes, 
there are a lot of inconsistencies between stud- 
ies in terms of what precise micronutrients or 
macronutrients associate with the disease. We're 
quite limited in terms of the data,” explains Nick
Wareham, head of the epidemiology unit at the 
UK’s Medical Research Council. 

Using new tools and methodologies, ambi- 
tious projects are underway to make up this 
shortfall. One such effort, which Wareham 
coordinates, is InterAct — a multinational 
study to define how diet and lifestyle influ- 
ence risk of type 2 diabetes. This disorder of 
blood glucose regulation is a growing problem 
in Europe, afflicting nearly 40% of the popu- 
lation at some point in their lifetime. InterAct 
estimates that diabetes accounts for as much
as 10% of health care costs in Europe. 


Through endeavours such as InterAct, 
researchers are starting to expose the complex 
interplay of genetics, diet and disease, and bring 
order to the confusing array of nutritional 
information. 

InterAct began in 2006 as part of the Euro- 
pean Community’s sixth Framework Pro- 
gramme. It has a budget of €10 million and 
involves more than 12,000 patients recently 
diagnosed with diabetes across 10 countries — 
nine in Europe plus India. Such a broad cohort 
is important. “Sometimes variation within a 
country is not so great,” says Wareham. “Inter- 
national efforts give you heterogeneity in the 
lifestyles of patients, especially in the diet, and 
that’s a major advantage.” This diversity pro- 
vides scientists with more variables to study 
as they attempt to untangle what factors are 
responsible for causing disease. 


THE BIGGER THE BETTER 

This research is part of the largest diet and 
disease study ever undertaken: the European 
Prospective Investigation into Cancer and 
Nutrition (EPIC). Initiated in 1992, EPIC has 
recruited more than half a million people. Par-
ticipants are physically examined at one of 23
centres, complete lifestyle surveys including 
detailed diet questionnaires, and have their 
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blood tested. Their DNA is scanned for dis- 
ease-related genes using techniques that can 
detect hundreds of thousands of genetic vari- 
ants in large numbers of individuals. 

“Large-scale projects can really be a cata- 
lyst to bring together multiple centres to share 
instruments,” says Wareham. “InterAct has 
benefited greatly from the huge EPIC cohort 
and access to those technologies.” 

Another large-scale study, a parallel to Inter- 
Act though not part of EPIC, is Interheart — 
which examined the link between dietary pat- 
terns and heart-attack risk. Between 1999 and 
2003, the Canadian-led study recruited 5,761 
patients and 10,646 control subjects, living in 
52 countries, across six continents. Using ques- 
tionnaires, physical examinations, and blood 
analysis, the teams compiled data on people 
including demography, diet, anthropometric 
measurements such as body mass index and 
biomarker levels including cholesterol and 
lipoproteins. 

Interheart researchers concluded that the 
globalization of a Western pattern diet — high 
in animal products, fried foods, and salty 
snacks — is responsible for a third of the risk of 
heart attack worldwide. A ‘prudent’ diet rich in
fruits and vegetables reduced risk regardless of 
location. Prior to Interheart, few epidemiologi- 
cal studies linked dietary patterns in ethnically 
diverse populations and cultures to disease. 
Research like this “is crucial if we truly want to 
understand these diseases, because they mani- 
fest differently in European and other popu- 
lations,” says nutritional geneticist Jim Kaput, 
who also serves as director of the Division of 
Personalized Nutrition and Medicine at the US 
Food and Drug Administration. 


CROSSED PATHWAYS 

InterAct and Interheart both demonstrate that 
the metabolic pathways at the epicentre of 
dietary-related illnesses, such as diabetes and 
cardiovascular disease, are strongly related. 
Research on one can uncover clues to the other. 
“Factors like blood pressure, cholesterol and 
triglyceride levels, which are predictive of cor- 
onary heart disease, are also associated with 
diabetes,” notes Wareham.

Leafy vegetables, such as lettuce and spin- 
ach, are core components of the prudent diet 
as identified by the Interheart study. These 
vegetables are enriched in polyunsaturated fatty 
acids (PUFAs) — essential macronutrients also 
found in some types of fishes, nuts and cheese. 
Two types of PUFAs in particular, omega-3 and 
omega-6, are powerful dietary components 
because they can change gene expression, both 
directly and indirectly. PUFAs “act more like 
hormones than like typical food,” says nutrition

scientist Donald Jump of 
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For example, PUFAs 
have two ways to 
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modulate gene activity and lower the 
levels of fatty acids and triglycerides 
in the liver: they can bind and acti- 
vate a family of transcription factors 
called peroxisome-proliferator-acti- 
vated receptors (PPARs) to speed up 
the breakdown of fatty acids; PUFAs 
can also deplete another transcrip- 
tion factor, sterol regulatory ele- 
ment binding protein-1 — thereby 
curtailing fatty-acid synthesis. This
two-pronged attack provides a sig-
nificant health benefit. “Along with
cholesterol,” explains Jump, “elevated
triglycerides are a common target for
the management of atherosclerosis,
cardiovascular disease and stroke.”

[Figure axis label: Relative change in heart attack risk (%)]

Although most research on
PUFAs has focused on their con-
nection to cardiovascular disease, by
manipulating an enzyme involved in
fatty-acid metabolism Jump’s team
has demonstrated that PUFAs can also balance
blood glucose levels, suggesting a potential
treatment for type 2 diabetes.

“Dietary omega-3s are not usually thought 
of as a treatment for elevated blood sugar,” 
says Jump. Yet by studying diet and its effect 
on metabolic pathways, these types of links are 
being uncovered. 

After diet and metabolism, a third element 
in development of diet-related disease is the 
genome. “In humans, genetic background 
plays a role in either responding well, or not, to 
particular nutrients,” says Jump. Adding global
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Diets high in certain food types carry an elevated risk of heart attack. 
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genetic diversity to the mix greatly increases 
the complexity of the research. 

The International HapMap Project, a 
database of genetic variation, began in 2002 
and stores data from Canada, China, Japan, 
Nigeria, the United Kingdom and the United 
States. So far, the project has identified tens of 
millions of single nucleotide polymorphisms 
(SNPs) associated with both disease and drug 
response. SNPs in PPAR genes that are regu- 
lated by PUFAs affect, among other things, the 
ability to lose weight, a crucial step to control- 
ling diabetes. One SNP in particular has been 
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found to account for 7% of the vari-
ation in people’s weight loss.

In 2010, InterAct recruited the 
ten-thousandth person to their 
search for markers that reveal the 
roles of obesity and exercise in the 
risk of developing diabetes. Analyses 
of their genomes will be published 
in 2011, and Wareham is confident 
this approach will uncover new 
interactions. “We're using a discov- 
ery approach on a mass scale,” he 
says. “We don’t focus so much on 
the expression of particular genes, 
but on the interplay between innate 
susceptibility and dietary and life- 
style factors.” 

Such a link would fill a gap in our 
knowledge. “Hypothetically,” muses 
Wareham, “the genome-wide influ- 
ence of dietary and lifestyle factors 
may account for the heritability we 
have seen in diabetes that remains unaccounted 
for on the basis of simple genetic variation.” 

Subtle changes are lost in the background 
noise of standard single-gene studies. But they 
do have an impact — not least of all for our defi- 
nition of disease and the way that clinical trials 
are designed. “Geneticists,” says Kaput, “have
been treating all of us like coins when it comes 
to deciding whether we have a disease: heads or 
tails — are you or are you not diabetic? That's 
how many studies are designed.” 

There is a need for new ways to interpret dis- 
ease that recognize the contribution of genes 


DEVELOPING WORLD NUTRIGENOMICS 


Reversing the health and nutrition relationship. 


As globalization is exporting fatty fast-food 
diets around the world, malnutrition is 
rampant in many developing countries. 
According to Médecins Sans Frontières,
malnutrition causes 60% of deaths in 
children under the age of five in developing 
nations. New technologies that provide 
nourishment while treating diseases could 
save millions of lives. 

That's the stark reality that led Raymond 
Rodriguez to launch the Global HealthShare 
Initiative (GHSI). “We started to look at 
health disparities in racial, ethnic and 
economically disadvantaged communities,” 
explains Rodriguez, director of the Center of 
Excellence for Nutritional Genomics at the 
University of California, Davis. “Often the 
people who need food and drugs the most 
are the last to get them.” 

Lack of proper nutrition opens the door 
to disease. “Malnourished individuals have 
reduced immunity, and thus vaccines 
are less effective,” says Somen Nandi of
GHSI. Nandi is leading efforts to develop
an international network of researchers,
investors, non-governmental organizations
and drug makers to combat diseases with
nutrition-based therapeutics. “We’re merging
the concepts of nutrition and immunity,” says
Nandi.

[Figure caption: Malnutrition impairs vaccine efficacy.]

One tangible benefit of this new way of 
thinking is a novel rice-based matrix in
development as a delivery base for a vaccine
against cholera and diarrhoea. This could 
help sustain the malnourished and bolster 
their immunity, while immunizing against the 
diseases. 

GHSI’s vaccine development projects also 
recognize the economic factors involved. 
Rodriguez’s and Nandi’s ambitious goal 
is to help create sustainable economies in 
countries where diarrhoeal diseases are 
prevalent, such as Bangladesh. They hope 
to identify and develop sites in resource- 
limited countries where therapeutics can be 
formulated, manufactured and distributed. 

“We think that GHSI will create an opport- 
unity for the four billion people who are not 
a part of the global economy to enjoy better 
health and a better standard of living,” says
Rodriguez. Thus the fruits of nutrigenomics 
research could not only help Westerners 
cope with a diet of excess, but also bring 
better lives to the impoverished people in 
developing nations. 
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and metabolic pathways. Metabolites are small 
molecules — amino acids, vitamins and other 
chemicals — in circulation and influenced 
by both genes and nutrient intake. Metab- 
olomics, or the study of these metabolites, 
offers perhaps the best opportunity to observe 
these interactions in a minimally invasive 
manner. 

Metabolic phenotypes can be very finely 
segregated, as demonstrated in an analysis of 
urinary metabolite patterns found in thousands 
of individuals in China, Japan, the United King- 
dom and the United States. Not only did East 
Asians have a different pattern of metabolites 
from Western populations, but individuals 
from northern China could be differentiated 
from those in southern China. Both Chinese 
groups were distinct from Japanese, who were 
in turn different from Japanese Americans.
People who consumed a lot of meat, as is com- 
mon in Western diets, had elevated levels of 
biomarkers indicative of high blood pressure 
compared with people who have a primarily 
vegetarian diet. 

InterAct is also searching for novel bio- 
markers that accumulate as an individual’s 
risk of diabetes rises. When combined with 
epidemiological studies, this type of metabolic 
phenotyping could lead to the identification of 
biological red flags for individuals, even before 
disease manifests. Biomarker metabolites might 
also be therapeutic targets one day. 


BREAKING DOWN SILOS 

While large-scale scientific projects such as 
InterAct and Interheart have had success, bar- 
riers still exist to international collaborations. 
Researchers occasionally encounter a lack of 
willingness or an inability to share informa- 
tion. “It has sometimes been a challenge to 
convince colleagues who run the individual 
centres that by working together we end up 
with better science,” says Wareham.

Kaput agrees. He suggests that biologists 
take a page from the physicists’ handbook. 
“They built the Large Hadron Collider, thereby 
working across disciplines,” he says, “but we
still haven't made the silos go away in the bio-
logical sciences community.”

Wareham has faith in the technology- 
driven approach that encourages and 
facilitates collaboration. These major projects, 
he says, can bring different disciplines closer 
together — as they have in the genetic HapMap 
project. “The ability to measure multiple SNPs 
at very low cost on a mass scale revolutionized 
that field, and I think that’s where we're headed 
for other risk factors such as diet and nutrition,” 
contends Wareham. 

Such a large-scale, system-wide approach 
is being used by Kaput and FDA chemist 
Carolyn Wise. They are considering early 
environmental influences, micronutrient 
availability, metabolic and regulatory pathways 
and genome-wide association maps as they 
try to define combined genetic-metabolic types 
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for people in terms of obesity and type 2 
diabetes. This approach can be emulated
and scaled in other studies to help researchers 
work together and share information as they 
try to make sense of huge amounts of data. 


PREVENTION IS BETTER THAN CURE 

As these nutrigenomic studies begin to 
classify individuals into specific groups based 
on the interplay of their lifestyle, metabolic 
pathways and genetic variants, tailored diets 
may become early therapeutic interven- 
tions. Personalized diets might even guide 
people genetically at risk for diabetes, but not 
yet in a pre-diabetic state, to help them avoid 
developing the disorder by fine-tuning what 
they eat. A well-regulated ounce of prevention 
could obviate the need for a cure. 

However, the development of personalized 
diets has been prematurely promised before. In 
the early 2000s, a slew of companies claimed 
to be able to provide personalized nutritional 
advice based on genetic tests. An investigation 
by the US Government Accountability Office 
in 2006 found that these companies “misled 
consumers” and provided only generalized 
advice. The US Senate Special Committee 
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on Aging convened a hearing that further 
criticized these direct-to-consumer genetic 
tests. Unable to secure funding, several of 
these companies went bust. 

Here, says Kaput, is where the FDA’s Divi- 
sion of Personalized Nutrition and Medicine is 
ahead of the curve. “Right now, we don't have 
a product to regulate. We're not sure where the 
field is going necessarily, but when products 
come in for possible FDA regulatory activities, 
we will have the research background to help 
the regulatory centres make their evaluation.” 

Teasing out the relationship between food 
and disease is a tricky task, one that involves 
tens of thousands of people and encompasses 
hundreds of nutritional and genetic factors. It is 
not likely to provide simple or quick fixes either, 
meaning that for now at least the ‘tea causes 
cancer’ stories can safely be ignored.


Farooq Ahmed is a science writer in New York. 
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DIVERSITY 


Of beans and genes 


Several human genes involved in digestion have diverged along cultural lines. Research 
suggests these adaptations influence the range of foods tolerated and even certain diseases. 


BY MICHAEL EISENSTEIN 


Nathaniel Dominy was surprised to find
that diet-related genes, like people,
sometimes simply repeat themselves
to get a point across rather than change the
message.

In 2007, while most evolutionary biologists
were looking for evidence of selection in the 
form of genetic mutations, Dominy and col- 
leagues learned that people with high-starch 
diets had additional copies of the gene cod- 
ing salivary amylase and that these repeats 
increased production of the carbohydrate- 
processing enzyme. “Few would have expected 
at the time that [these repeats] could have any 
effect at all, and they had a big effect,” recalls
Dominy, an anthropologist at Dartmouth 
College in New Hampshire. 

This discovery also offered proof to the 
growing number of evolutionary geneticists 
who believe that culture-specific factors, 
such as diet, have had as powerful an effect on 
human evolution as more obvious externalities 
like climate and habitat — with some even sug- 
gesting that these factors could have accelerated 
the overall evolutionary pace. “I don't think we 
have the data yet to make those claims,” cau- 
tions Mark Stoneking, a population geneticist at 


the Max Planck Institute in Leipzig, Germany. 
“[But] there certainly has been recent evolu- 
tion in modern humans because of responses 
to natural selection, and culture may very well 
be playing a role in a lot of those.”


THE GENETICS OF LUNCH 

Even if you can enjoy a cold glass of milk with- 
out then feeling sick to your stomach, chances 
are you know somebody who can't. In fact, 
adult lactose intolerance is the biologically 
‘normal’ state of affairs. “The general pattern
in mammals is to lose lactase expression after 
weaning,” explains Dallas Swallow, a geneticist
at University College London. 

Nevertheless, adults with ‘lactase persistence’
are widespread in many parts of the world. For 
example, lactase persistence is characteristic 
of 89%-96% of Scandinavian and British peo- 
ple, is widespread among pastoralist cultures 
in Africa and the Middle East, but appears in 
only 1% of Chinese individuals. 

Although one single nucleotide polymor- 
phism (SNP) affecting lactase gene expression 
accounts for the vast majority of European 
instances, this trait seems to have arisen inde- 
pendently in different regions of Africa as a 
result of several distinct yet tightly-clustered 
variations within a regulatory segment of the 


lactase gene. “That’s convergent or parallel evo-
lution,” says Swallow. “The same phenotype is
being selected, with different mutations causing 
that phenotype.” 

Each of these variants is thought to have 
emerged within the last 10,000 years, roughly 
coinciding with the emergence of agriculture 
and dairy farming, and conferring obvious 
advantages on those cultures. “Milk is nutri- 
tionally good, and if you don't have lactase you 
can't digest the main carbohydrates in it: you 
might get diarrhoea or flatulence, and you've 
lost a source of food, water and calcium,” says 
Swallow. “In the context of African tribes, the 
most plausible thing is that it was a source of 
clean, nutritious liquid.”

Most geneticists cite lactase persistence as 
a leading example of recent human evolution 
driven by shifts in culture and diet. “This happens
to be a ‘low-hanging fruit’,” says Sarah
Tishkoff, a geneticist at the University of Penn- 
sylvania. “It’s a Mendelian trait and it left a really 
strong selection signature.” Identifying other,
equally clear examples has proven challenging,
although the subsequent amylase breakthrough by
Dominy and colleagues suggests that such traces
are there to be found.

Today, many people enjoy starch-rich diets 
as a matter of choice, but for early humans, 
tubers and other starchy plants might have 
been an essential staple in lean times (see 
The first supper, page S8). “Amylase is the 
only enzyme that can hydrolyze starch,” says
Dominy. “If you can produce a lot of amylase, 
you have a big advantage in the sense that you 
can extract and assimilate carbohydrates almost 
instantaneously at the level of the mouth.”

The number of copies of the salivary amylase
gene, AMY1, was already known to vary
among individuals. In partnership with Anne 
Stone and then graduate student George Perry 
at Arizona State University, Dominy demon- 
strated that not only does this copy number 
directly correlate with enzyme levels, but the 
average copy number within a population also 
correlates with the starch content of their tra- 
ditional diet. For example, the Japanese rou- 
tinely consume large amounts of rice and other 
starch, whereas the Yakut, a Siberian hunting 
and fishing culture, have a diet based on fish 
and meat; these differences are reflected at the 
level of the AMY1 gene in Japanese and Yakut 
populations. “Even though they are closely 
related genetically, and geographically not 
separated by a great distance, there’s a dif- 
ference in the number of copies in these two 
populations on average,” says Dominy. 


SUFFERING FOR HEALTH 
The shift from an active foraging lifestyle to 
a more sedentary agricultural existence also 
appears to have introduced selective pres- 
sures, as populations struggled to survive 
nutritional deficiencies. Some intriguing but 
enigmatic signs of lifestyle-specific adapta- 
tion have been detected in the gene encoding 
N-acetyltransferase 2 (NAT2), an enzyme that 
is best known for its role in drug metabolism, 
but which also contributes to the processing of 
toxins ingested from plants and well-cooked 
meat. In a series of recent studies, geneticist 
Lluis Quintana-Murci of the Institut Pasteur 
and colleagues investigated the extent to which
different populations express NAT2 vari- 
ants that acetylate — and thereby help break 
down — target molecules quickly or slowly. 
“We showed that most hunter-gatherer popu- 
lations present fast-acetylation alleles,” he says,
“whereas the slower acetylators are very com- 
mon in farmer-descended populations, like 
with most Europeans and particularly in the 
Middle East.” NAT2 is also associated with the 
metabolism of folate, the natural form of folic 
acid, normally obtained from leafy greens or 
animal liver. Quintana-Murci’s team proposed 
a model in which the sharp drop in folate 
intake associated with a shift to a grain- and 
cereal-rich diet favoured the emergence of alle- 
les that suppress use of folate reserves, although 
he emphasizes that this is purely speculative 
until further data are available. 

Both versions of the NAT2 enzyme carry 


certain disadvantages, as acetylation can actu- 
ally enhance rather than reduce the toxicity of 
certain compounds: fast acetylation is linked 
with colon and lung cancers, whereas slow 
acetylation is associated with prostate and blad- 
der cancers. Therefore, any nutritional benefits 
are likely to be closely balanced against the 
potentially harmful outcomes of NAT2 vari- 
ation. 

There are a number of other instances 
where selective pressures appear to have 
resulted in a trade-off. For example, although 
the twenty-odd T2R proteins involved in bitter 
taste perception represent a potent early warning
system for harmful compounds, the genes encoding
these factors also exhibit a striking level of
variability (see More than
meets the mouth, page S18). Unusual patterns 
of distribution have been observed for sev- 
eral variants that could alter the sensitivity of 
the mouth to bitter chemicals. “It’s clear that 
there are ethnic differences in the composition 
of T2R haplotypes,” says Wolfgang Meyerhof, 
a geneticist at the German Institute of Human 
Nutrition in Nuthetal. 

In a study of taste variation in central African
populations by Meyerhof and colleagues,
one low-sensitivity variant of the T2R16 bitter 
receptor, which normally responds to cyano- 
genic glycosides found in the starchy tuber 
cassava, was found to be unexpectedly com- 
mon. These glycosides are metabolized in the 
gut to release toxic cyanide. The researchers 
speculated that the health costs of consuming 
potentially toxic compounds must be balanced 
by some sort of positive selection, perhaps 
arising from enhanced resistance against the 
malarial parasites that are widespread in this 
region, to sustain this allele in the population. 

In fact, there are several instances where the 


benefits of lowering pathogen susceptibility 
are apparently sufficient to select for other- 
wise deleterious alleles. “Infectious disease is 
probably one of the strongest selective forces 
in the past or ever,” says Tishkoff. She points
to the example of glucose-6-phosphate dehy- 
drogenase (G6PD) deficiency, a widespread 
enzymopathy associated with blood cell defects 
and potentially severe toxic reaction to foods 
including the fava bean (shown in main image, 
page S13). The gene variants associated with 
G6PD deficiency, also known as ‘favism’, are
widespread in several ethnic groups that rou- 
tinely eat these beans. Their prevalence, in spite 
of the near-term dietary and health costs, could 
result from the protection these variants confer 
against malaria. 

As with lactase persistence, the strong 
adaptive advantages of this phenotype are 
demonstrated by its independent appear- 
ance in diverse populations; distinct G6PD- 
deficient alleles have emerged in Africa, the 
Mediterranean and the Middle East. More 
recently, Quintana-Murci and colleagues determined
that an allele of this gene that is prevalent in
Southeast Asia, and which results in only moderate
enzyme deficiency, appears to protect against
Plasmodium vivax — an unexpected finding, 
as P. vivax is seldom lethal and was presumed
to represent a much less potent force for short-
term human evolution than its highly dangerous
relative P. falciparum. “It’s about the
consequences,” says Quintana-Murci. “P. vivax
could be important in childhood, or in women 
who are pregnant and infected — maybe they 
won’t die, but their babies are born with low
birth weight, which eventually weakens them 
and raises their chances of dying.” 

However, not all phenotypes can be directly 
linked to genetic variation. As years of genome- 
wide association studies have demonstrated, 
most traits are considerably more complicated 
and multifactorial in nature, and tracking
down these complex changes will require alternative
allele-hunting tactics.

[Figure: Milk provides many nutrients to those who can tolerate it (see map on opposite page).]

[Figure: Milk and starch consumption. Global distribution of genes related to ability to digest starchy foods and lactose in milk.]


A TANGLED WEB

University of Chicago geneticist Anna Di 
Rienzo recently tried to identify allelic vari- 
ants that differ in frequency among populations 
residing in similar geographic regions or eco- 
systems but who have distinct diets or modes 
of subsistence, such as farming or foraging. 
Through this approach, her team uncovered 
various hints of genetic adaptation in carbo- 
hydrate metabolism and folate production, 
associated with the adoption of diets based on 
roots and tubers. Conversely, cultures with a 
cereal-rich diet were more likely to produce a 
truncated, hyperactive version of PLRP2 — an
enzyme responsible for breaking down plant
glycolipids — than their non-cereal-consuming
counterparts. “There is a consistent frequency
shift between populations,” says Di Rienzo.
“The stop codon [in PLRP2] occurs always at
higher frequencies in populations that have 
a cereal-rich diet compared to populations 
that don’t and yet live in the same geographic 
region.”

Yet it remains a challenge to piece together 
such minor genetic variations scattered 
throughout the genome. “These aren’t muta- 
tions that will knock you dead,” says Di Rienzo.
“They make subtle changes to gene function or 
expression, and detecting those subtle changes 
can really be quite hard.” 

For more recent adaptations, the muta- 
tions can also be very rare, making it difficult 
to detect clear patterns. Even when the data 
seem to suggest the presence of selective pres- 
sure behind a given variant, it is essential to 
have a solid understanding of the cultural his- 
tory of the region to eliminate demographic 
biases. “If a lot of the individuals you've tested
have the same great-grandparents, it’s quite a 


different story from if they were relatively unrelated,”
explains Swallow.

Most importantly, studies need to dem- 
onstrate clear functional contributions from 
a particular variant or subset of variants, 
and arrive at plausible reasons for why these 
changes are adaptive in some cases but not in 
others. “We want to know what the biology is 
that’s being affected by these unusual patterns,” 
says Stoneking. “How many of them are real, 
and how many are false positives, and what are 
the underlying stories? That's sort of where the 
field is a bit stuck at the moment.”


IT TAKES A COMMUNITY 

Clearly, scouring for signals of recent 
evolution amid the tens of thousands of 
interconnected human genes and regulatory 
regions can be compared to finding the prover- 
bial needle in the haystack — but what if that 
haystack is far bigger than most people think? 

Jeremy Nicholson, a biological chemist at 
Imperial College London, points out the tre- 
mendous diversity of the intestinal microbial 
flora, citing a report in 2010 which showed that 
Europeans each carry a complement of at least 
160 bacterial species, with more than 536,000 
bacterial genes between them — well over 
20 times the human gene count¹. “It actually
should be thought of as a multicellular organism
with a very large genome,” he adds.

Even with our limited understanding of 
the microbial communities that thrive in our 
digestive tract and elsewhere in the body, 
it’s increasingly clear that their net genomic 
output is inextricably linked with our own 
metabolic function, and the composition and 
activity of these communities is a direct by- 
product of our environment, culture and diet. 
“The gut microbial community can be viewed 
as a metabolic organ — an organ within an 
organ; they sense, adjust to, and process 
components of our diet, and their metabolic
products profoundly influence our physiology,”
says Jeffrey Gordon, a microbiologist at
Washington University in St Louis, Missouri.
“It’s like bringing a set of utensils to a dinner
party that the host does not have.” 

Nicholson has already found some compel- 
ling evidence that genes expressed by the gut 
flora have effects that reach far beyond the 
digestive tract. “We've found deep compart- 
mental connections between microbial status 
and bile acid metabolism,” he says. “[And]
there are some staggering connectivities 
between blood pressure and gut microbial 
metabolites.” 

Research from Gordon's lab has shown there 
are differences in the sets of bacterial species 
that reside in the guts of individuals, even iden- 
tical twins. “Certainly less than 10% — and it 
might even be less than 2%—of the bugs that are 
in you are also in me,” says Nicholson. Gordon
and others are confident that the impact of 
cultural variation is at least as strong. 

Our understanding of the genetic basis 
of even relatively well-characterized phe- 
nomena pertaining to dietary variation, like 
lactase persistence, could be confounded by 
the impact of these commensals. “There are 
Chinese students who come to the West who 
can drink quite a lot of milk even though they 
come from a genetic background where they're 
not lactase persistent,’ says Swallow. “We think 
that’s due to adaptation of the intestinal flora.”
It also appears possible that the consider- 
ably smaller, but potentially equally diverse, 
microbial communities in our mouths may play 
an important role in the early stages of meal 
digestion, as indicated by a recent study that 
suggests oral bacteria may facilitate the pro- 
cessing of wheat gluten. 

One of the most striking findings comes 
from a recent study by a team at France's Centre 
National de la Recherche Scientifique, present- 
ing strong evidence that Japanese individuals 
can digest seaweed carbohydrates more efficiently².
This was made possible by an ancestral
gene transfer event from kelp-borne bacteria 
that endowed their gut flora with the capacity 
to produce porphyranase and agarase enzymes. 
This adaptation is seemingly absent in North 
Americans who have not historically consumed 
raw kelp. Microbe-watchers like Nicholson sug- 
gest that this study could be a strong indicator of 
the future, as the research community begins to 
come to terms with the extent to which human 
genetic effects on diet might be overwhelmed 
by the bacteria we carry. “It’s a piece of genius,” 
he says. “It’s something I use in my slide presentations
now to worry geneticists.”


Michael Eisenstein is a journalist in 
Philadelphia. 


1. Qin, J. et al. Nature 464, 59-65 (2010).
2. Hehemann, J. H. et al. Nature 464, 908-912 (2010).
3. Itan, Y. et al. BMC Evolutionary Biology 10, 36 (2010).
4. Perry, G. H. et al. Nature Genet. 39, 1256-1260 (2007).


23/30 DECEMBER 2010 | VOL 468 | NATURE | S15 


Eighteenth century chemist Antoine Lavoisier investigates whether exhaled breath is analogous to the fumes of a combustion engine. 


The changing notion of food 


The pioneers of nutrition research determined the energy content of food and also helped to 
overturn misconceptions about various diseases that plagued humankind. 


BY NED STAFFORD 


Nutrigenomics — and the rest of modern
nutrition science — stands on
foundations laid in the late eighteenth
century.

That is not to say that nobody had taken 
an interest before then in how food works. 
The ancient civilizations of Egypt, Greece, 
Rome, Persia, China and India were aware 
of a link between food and health. “They all 
had their food rules, many of which are still 
valid today,” says Claus Leitzmann, a human
nutritionist at the University of Giessen in 
Germany. “The ancient Egyptians used gar- 
lic medicinally.” 

Some of our food-truths hark back millen- 
nia. The ancient Greek physician Hippocrates 
recommended that food should be thoroughly 
chewed before swallowing, and consumed in 
moderation to maintain good health. In the 
Middle Ages, the German nun and Christian 
mystic Hildegard of Bingen “knew a lot about 
food”, says Leitzmann. “She made some very 
intelligent recommendations,” such as eating
cooked rather than raw foods. 

But before the eighteenth century there 
was little scientific investigation into the 
composition of food or how the body processes 


it. The researchers of the time were “dependent 
on experimental observation”, says Leitzmann.
Their method was ‘feed and watch’. It was
French chemist Antoine Lavoisier, regarded 
as the father of modern chemistry, who first 
conducted the research that led to today’s 
science of nutrigenomics. 


FOOD AS FUEL 

Lavoisier was one of the first scientists to 
design laboratory equipment to test what 
happens to food after it is swallowed. Before 
his work, scientists knew that the weight of 
ingested food exceeded the weight of excreted 
faeces and urine. They attributed this loss to 
perspiration. But Lavoisier believed that food 
was fuel and that the body, like the fuel-burn- 
ing engines being developed at the time, must 
expel carbon dioxide as a product of combus- 
tion. He suspected that exhaled carbon dioxide 
accounted for this lost matter. 

To test his theory, in the early 1780s Lavois- 
ier invented a new type of device — the ice 
calorimeter. It was composed of an outer shell 
packed with ice, to maintain a constant tem- 
perature of 0 °C, encasing a chamber housing 
a guinea pig. The animal’s body heat melted 
the ice. By weighing water flowing out of the 
calorimeter, Lavoisier was able to estimate 
metabolic heat and compare it with the heat
produced by a lit candle or burning charcoal. 
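
The arithmetic behind the ice calorimeter can be sketched in a few lines. The example below is only an illustration: the latent heat of fusion of ice (about 334 joules per gram) is a modern textbook value rather than a figure from Lavoisier's work, and the function name and the sample mass of meltwater are hypothetical.

```python
# Latent heat of fusion of ice in joules per gram.
# Assumption: modern textbook value, not taken from the article.
LATENT_HEAT_FUSION_J_PER_G = 334.0


def heat_from_meltwater(meltwater_grams: float) -> float:
    """Estimate the heat (in joules) that melted the given mass of ice at 0 °C."""
    if meltwater_grams < 0:
        raise ValueError("meltwater mass cannot be negative")
    return meltwater_grams * LATENT_HEAT_FUSION_J_PER_G


# Hypothetical reading: 100 g of meltwater collected from the calorimeter.
heat_joules = heat_from_meltwater(100.0)
print(f"Estimated heat released: {heat_joules:.0f} J")
```

Running the same calculation for an animal and for a burning candle would give two heat estimates that can be compared directly, which is the comparison the text describes.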

His theory proved correct. Lavoisier declared: 
“respiratory gas exchange is a combustion like 
that of a candle burning.”

In today’s calorie-counting world, this does 
not sound like much of a revelation. But at the
time it was a breakthrough. “It was theoreti- 
cally important to realize that the body needed 
energy to function and that one major func- 
tion of food is to supply it,” says Elizabeth 
Neswald, a science historian at Brock Uni- 
versity in Ontario, Canada. “It was a basis for 
determining what someone needs to survive; 
what leads to weight gain, what leads to weight 
loss, what enables physical labour and what 
the relationship between food and physical 
labour is.” 

Lavoisier’s research also emphasized the 
importance of food composition and of realizing
that faeces, urine, perspiration and respiration
are an essential part of the equation.
“These early nutrition scientists spent a large
part of their time — or their assistants’ time —
inspecting and analysing other people’s
excrement,” says Neswald. “In nutrition
experiments, it was vital to assess the differences 
between input and output — food going in and 
all products coming out.” 

This method, known as ‘balance trials’, was
pioneered in the 1830s by French chemist Jean- 
Baptiste Boussingault. He conducted balance 
trials for nitrogen — a constituent element of 
proteins — by comparing the nitrogen content 
of hay, oats and potatoes fed to cows and horses 
with the animals’ excrement and, in the case 
of cows, milk. He showed that animal feed 
contained sufficient nitrogen to meet bod- 
ily requirements, ending speculation that 
additional nitrogen was obtained from the 
atmosphere. 


MACRONUTRIENT EXPLORATION 

By the mid-nineteenth century, scientists had 
learned that the primary elements in food are 
carbon, nitrogen, hydrogen and oxygen, and 
had divided food constituents into four main 
types: carbohydrates, fats, protein and water. 
Yet the chemical make-up of the first three 
classes was unknown. 

It was a German chemist, Justus von Liebig, 
who made the next leap forward. The preco- 
cious von Liebig (appointed professor 
at the University of Giessen at age 21) 
invented the ‘kaliapparat’, a special
piece of glassware for analysis of car- 
bon in organic compounds. 

Von Liebig’s laboratory, arguably 
the first teaching laboratory, attracted 
scientists from around the world. He 
helped train a generation of nutritional 
researchers whose work would carry 
on into the early twentieth century. 
In the 1860s, for example, two of von 
Liebig’s protégés — physiologist Carl 
von Voit and chemist Max Joseph von 
Pettenkofer — obtained funding from 
the Bavarian government to build a 
state-of-the-art respiration chamber 
large enough to hold a person. The chamber 
could measure the daily balances of both car- 
bon and nitrogen and thereby estimate human 
protein requirements. 

Neswald notes that most of the nutrition 
research of this period focused not on the 
health of individuals, but rather on finding 
the cheapest, easiest methods to feed “insti- 
tutionalized and impoverished populations” 
to prevent food riots. Von Voit, says Neswald, 
visited prisons and workhouses “to assess 
what people were fed and what their state of 
health was, with the aim of providing dietary 
guidelines”. 

The concept of food as fuel, which contains 
important dietary components, was further 
refined in the United States. Agricultural 
chemist Wilbur Olin Atwater had spent time 
in von Voit’s laboratory as a postdoc, return- 
ing to the United States in 1871 to spearhead 
nutrition science. Atwater spent five years in 
the 1890s building a respiration calorimeter 


larger than von Voit’s and able to hold humans 
for longer than a day. His measurements were 
so precise that his energy equivalents for pro- 
tein, fat and carbohydrate are still used today. 
Atwater was first to adopt the word ‘calorie’ as an 
energy unit for food. (A calorie of food energy is 
actually equivalent to 1000 calories of thermal 
energy.) 
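
A short sketch can make the energy bookkeeping concrete. It assumes the rounded factors usually attributed to Atwater (about 4 food calories per gram for protein and carbohydrate, and about 9 per gram for fat); those numbers are not given in the text above, and the function names and the sample food quantities are hypothetical.

```python
# Rounded energy equivalents in food calories (kilocalories) per gram.
# Assumption: commonly quoted values, not figures stated in the article.
ATWATER_KCAL_PER_G = {"protein": 4.0, "fat": 9.0, "carbohydrate": 4.0}


def food_energy_kcal(protein_g: float, fat_g: float, carbohydrate_g: float) -> float:
    """Estimate the energy of a food portion in food calories (kilocalories)."""
    return (
        protein_g * ATWATER_KCAL_PER_G["protein"]
        + fat_g * ATWATER_KCAL_PER_G["fat"]
        + carbohydrate_g * ATWATER_KCAL_PER_G["carbohydrate"]
    )


def kcal_to_thermal_calories(kcal: float) -> float:
    """Convert food calories to thermal calories (1 food calorie = 1000 thermal calories)."""
    return kcal * 1000.0


# Hypothetical portion: 10 g protein, 5 g fat, 20 g carbohydrate.
energy_kcal = food_energy_kcal(10.0, 5.0, 20.0)
print(f"{energy_kcal:.0f} food calories")
print(f"{kcal_to_thermal_calories(energy_kcal):.0f} thermal calories")
```

The second function simply applies the factor of 1000 mentioned in the parenthetical note on food and thermal calories.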


SMALLER AND SMALLER 

Scientists soon began to realize that in addi- 
tion to supplying energy and macronutri- 
ents, food also played a more subtle role in 
health and disease. Japanese physician Takaki 
Kanehiro, who studied in the 1870s at St Tho- 
mas’s Hospital Medical School in London, 
was a rare exception to the nineteenth cen- 
tury German dominance of nutrition. “He was 
the first person to show that beriberi arises 
from malnutrition,” says Katsuhiko Yokoi, a
human nutritionist at Seitoku University in
Japan. Previously beriberi was thought to be
an infectious disease. 

[Figure: Atwater-Rosa calorimeter used to measure human energy demands.]

By the early twentieth century, other scientists
around the world had begun to explore
links between nutritional deficiencies and
other ailments, including rickets and
scurvy. Unable to explain these afflictions in
terms of fats, protein or carbohydrates, some 
scientists began to suspect the existence of 
another class of food ingredients. 

It was Polish biochemist Casimir Funk 
who in 1912, while studying beriberi, iso- 
lated thiamine, the nutrient that protects 
against this disease. He called the substance a 
‘vital amine’, which soon became ‘vitamin’.

The battle against scurvy is an example 
of science later refining a nutrition-related 
disease association. In the mid-eighteenth 
century, Scottish naval physician James Lind 
found that scurvy could be treated or pre- 
vented by eating citrus fruits. But he incorrectly 
thought that sea air was to blame for the disease. 
Other erroneous suggestions followed: in 1846, 
for example, Scottish toxicologist Robert 
Christison hypothesized that scurvy was 
caused by protein deficiency. Scurvy contin- 
ued to be a sporadic problem into the early 
twentieth century. It was not until 1932 that 




US biochemist Charles Glen King showed 
that scurvy was caused by a deficiency of the 
newly discovered vitamin C. 

Animal research led to further vitamin and 
disease-related discoveries. US biochemist 
Elmer Verner McCollum learned German 
so he could read the works of past nutrition 
researchers, which inspired him to experi- 
ment on rats. At the University of Wisconsin, 
where McCollum initially worked, research 
protocols stipulated the use of cows as animal 
models. But McCollum convinced his superi- 
ors to let him try smaller animals. He bought 
12 albino rats from a pet store and established 
the first colony of rats for nutritional experi- 
mentation in the United States. In 1913, his 
studies with these rats led him to identify 
the first fat-soluble vitamin, vitamin A, and
he later showed that it is vitamin D — and not
vitamin A as some thought — that prevents 
rickets. 

Proving the link between micronutrients 
and disease didn’t come easily. US Public 
Health Service worker and epidemiologist 
Joseph Goldberger theorized that pellagra, 
then a major disease causing diarrhoea, der- 
matitis, dementia and death, was diet-related 
and not, as prevailing medical opinion 
held at the time, an infectious disease. 
In 1916, to prove his point, Goldberger 
and his assistant subjected themselves 
to a series of tests — they injected 
each other with blood from a pellagra 
sufferer, swabbed out the secretions 
of a pellagra-infected person’s nose
and throat and rubbed them into their 
own, and swallowed capsules contain- 
ing scabs of pellagra sufferers’ rashes. 
And yet despite such gross expo- 
sure, they did not develop pellagra. 
However, Goldberger was unable to 
find the diet-related cause. It was 

another two decades before American 
biochemist Conrad Elvehjem realized 
that pellagra was caused by a deficiency of 
niacin (vitamin B3). 

So many micronutrients had been discov- 
ered by 1944 that some believed the field of 
nutrition had been fully defined with little 
else to discover. But while the constituent 
parts of food might have been teased out, their 
impact on the body was only starting to be 
appreciated. 

From Lavoisier, through von Liebig, to 
scientists today such as Jose Ordovas (see 
Big science at the table, page S2), nutrition 
research has focused on smaller and smaller 
elements. As scientists have probed deeper into 
biochemical mechanisms of bodily absorption 
and function — unlocking mysteries as they 
go — they have also triggered new questions, 
until we get to ‘how do our genes interact with 
the food we eat?’ And that’s the question we 
are still trying to answer today.


Ned Stafford is a science writer in Hamburg. 
More than meets
the mouth 


Certain things taste differently to different people. Why is 
this, and does this affect our choice of food? 


BY MICHAEL EISENSTEIN 


Nearly 80 years after DuPont chemists
stumbled across evidence of genetic
variation in perception of the bitter- 
tasting compound phenylthiocarbamide (PTC), 
Danielle Reed’s team at the Monell Chemical 
Senses Center in Philadelphia, Pennsylvania, 
made a similarly serendipitous discovery. 
Reed was approached by a lab technician 
worried she had made a mistake with an experimental
quinine preparation. “She said, ‘I think
I made the solutions wrong — here, taste this’,”
recalls Reed, who then tasted the bitter compound.
“I’m like, ‘ugh, it seems fine to me.’ But
she said, ‘It tastes like water to me.’”
This strange observation eventually led to 
the discovery of a genetic locus that affects 


our tongue’s ability to detect bitterness in 
quinine — a big step on the road to under- 
standing how people differ from one another 
in terms of taste, and how these differences 
shape what we like to eat. 


A BITTER TASTE 
Bitter is one of the five primary tastes — along 
with sweet, sour, salty and the savoury umami 
— that compose the gustatory system. Of these, 
bitter is perhaps the best characterized in terms 
of the influence of genetic variability on taste. 
In humans, the cells responsible for bitter 
taste perception express 25 receptors (T2Rs) 
that vary in the chemicals they recognize but 
which appear to perform a common role in 
preventing people from eating toxic com- 
pounds. Accordingly, some scientists are 




convinced that humans evolved taste to detect 
harmful substances. “A newborn baby is born 
loving sweet and hating bitter — no experi- 
ence required,” says Linda Bartoshuk, director 
of human research at the Center for Smell and 
Taste at the University of Florida. 

Insensitive variants have been identified for 
several bitter receptor genes and are common 
in the general population. For example, muta- 
tions in T2R38 render individuals incapa- 
ble of tasting PTC or the related compound 
6-n-propylthiouracil (PROP). 

Such limited sensitivity can be an asset as 
many nutritious vegetables, including broc- 
coli and sprouts, also produce bitter tasting 
glucosinolates. These compounds include goitrin,
which is a thyroid toxin in large doses but
may protect against cancer in lower doses.

There are obvious nutritional advantages 
in mitigating the urge to avoid sprouts, and 
the adaptive value of this reduced-sensitivity
allele is evident in its global distribution
alongside the more common sensitive version. 
“The ratio of the alleles varies depending on 
where you go,” says Paul Breslin, a taste per-
ception researcher at Monell, “but you see that 
both have been maintained in almost every 
population you look at anywhere on Earth.” 

Yet efforts to firmly link individual genetic 
variations with altered food preferences have 
not been easy. Several studies have revealed 
geographic or ethnic differences in the dis- 
tribution of taste receptor variants that may 
have arisen from selective pressures (see Of 
beans and genes, page S13), but their effects 
on diet — and association with overall health 
— are controversial. “I’m a PTC non-taster: I
can't taste goitrin in vegetables very well. But I 
think this has very little to do with how much 
broccoli I choose to eat on a daily basis,” says 
Reed. 

Attempts to establish similar correlations 
between disease and taste have proven equally 
problematic. For example, there is no clear link 
between sensitivity to sweet tastes and predis- 
position to obesity, diabetes or other diseases 
related to excess consumption of sugars. 

Some of the strongest connections identi- 
fied relate to alcohol preference. In one study, 
Bartoshuk partnered with Yale University 
geneticist Ken Kidd to examine how bitter 
taste shapes alcohol perception within a cohort 
of students. “There was a clear relationship 
between sensitivity and whether ethanol is 
perceived as bitter and harsh or slightly 
sweet,” says Kidd. “Among those who were
homozygous for the high-sensitivity [bitterness 
allele], nobody drank very much.” Other stud- 
ies at Monell have hinted at a parallel role for 
sweetness receptor variation, where sensitiv- 

ity to, and preference for, 
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must be considered alongside the numerous 
other brain and metabolic factors involved in 
drinking alcohol. 

Collectively, these data raise a question: 
given the front-line role of taste perception in 
food consumption, and the clear advantages 
of quickly recognizing good and bad food 
sources, why is it so hard to associate genetic 
differences in taste function with dietary 
behaviour? 


NAME THAT TASTE 

A large part of this problem arises from chal- 
lenges in experimentally linking the highly 
subjective experience of taste with biologi- 
cal mechanisms. But gaps also remain in our 
understanding of the basic machinery of taste 
perception. This past spring, Charles Zuker’s 
team at Columbia University, New York, vali- 
dated the involvement of epithelial sodium 
channel ENaC as a component of sodium 
chloride salt perception in mice. Other salt 
receptors remain at large. “People describe 
potassium chloride as being kind of brackish 
tasting, maybe kind of metallic, like a dirty 
salt solution. It’s clearly salty,” says Breslin. 
“That can’t be through an ENaC, because 
those channels pass potassium ions very 
poorly.” 




things that are uniquely found in those [other] 
cells.” 


A GUT FEELING
Taste doesn’t end at the back of the tongue. 
Many of the same taste receptor genes 
expressed in taste buds are expressed through- 
out the digestive system and in other tissues. 
Preliminary investigations suggest that these 
non-oral receptors help regulate appetite and 
metabolism. “What better way to do so than 
having the very same receptors reporting back 
from the gastrointestinal tract?” asks Zuker. 
There is already strong evidence that taste 
receptors in the mouth help steer organisms 
towards the nutrients that the body needs 
most. “If you offer malnourished kids soups 
that are either plain, ordinary stocks or stocks 
that have been fortified, they generally prefer 
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Known and suggested taste qualities. 
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absorption from the blood. Munger adds 
that his own investigations of genes associ- 
ated with diabetes among Amish people have 
been confounded by these gut receptors and 
the ambiguity of their function. “We did see 
an association with variation in a particular 
bitter receptor and the ability of non-diabetic 
individuals to regulate their blood glucose,” he 
says. However, it remains unclear whether this 
association arises from the effects of receptor 
variation on tongue-level taste preference and 
food selection or whether the difference lies in 
how the gut reacts to particular foods. 


UNWIRING FLAVOUR 
The biggest outstanding issue for many taste 
scientists is understanding how the various 
raw chemical sensations that transmit taste are 
incorporated into a more nuanced and sophis- 
ticated sense of flavour. Perception at this level 
also depends on signals received by sense of 
smell, which exhibits far greater complexity, 
environmental adaptability and personal vari- 
ation. “You've got one sense, taste, that’s hard- 
wired for affect,” says Bartoshuk, “and another, 
smell, where the affect is extremely labile and 
learned very quickly and can also be extin- 

guished” 

Equally important is how the brain 


Furthermore, even though research- ce ee decides whether or not it likes what 
ers have known the cells responsible unknown it senses. Alexander Bachmanoy, a 
for sour taste since 2006, a defini- others geneticist at Monell, cites the exam- 


tive receptor has yet to be identified. 
This is partly because of the complex 
nature of oral response to acid, where 
taste effects overlap with somatosen- 
sory sensations, a category of percep- 
tual information that encompasses 
non-taste qualities such as temperature, 
texture or spiciness. 

Preliminary reports also hint at addi- 
tional taste qualities, enabling the tongue to 
recognize things like fatty acids or calcium. 
But there is little consensus on this, in part 
because no dedicated taste-quality cells have 
been identified and also because candidate 
receptors only partially account for our ability 
to distinguish these putative tastes. Some sci- 
entists are also sceptical because humans lack 
a lexicon to describe these qualities. “Just take 
alittle canola oil and taste it — it doesn't really 
have a taste,” says Bartoshuk. “My guess is that 
the real sensory input from fat is tactile — fat 
is gooey and oily and viscous and creamy.’ 

Most investigators remain open to the pos- 
sibility that there’s more to the mouth than just 
the ‘basic five. A 2009 study by Zuker’s team 
identified a protein expressed in sour cells that 
apparently contributes — in conjunction with 
somatosensory receptors — to the discrimi- 
nation of a ‘carbonation taste; and they are on 
the hunt for mechanisms that monitor other 
undiscovered qualities. “If you take an animal 
and label all the sweet, sour, bitter, salty and 
umami cells, there are still plenty of cells left,” 
he says. “What we're doing now is looking for 
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soups that are amino acid-fortified over eve- 
rything else, including very tasty high-calorie 
soups,’ says Breslin. “This is in young kids, 
who have no idea what’s going on. This sug- 
gests that somehow there's this ‘wisdom of the 
body” 

Evidence suggests that at least some of this 
activity may arise from metabolic signals trig- 
gered by taste receptor activation. “Taste cells 
express all sorts of different peptide hormones 
that are used in other areas of the body for reg- 
ulating satiety or blood glucose,” says Steven 
Munger, a neurobiologist at the University of 
Maryland. 

Several studies in the past few years suggest 
that these receptors also direct the secretion 
of metabolic hormones in the lower digestive 
tract in response to sweet, bitter or umami 
stimuli; for example, intestinal sweetness 
receptor signalling may help regulate glucose 


ple of sweet-liking mice developed in 
his lab. “Through selective breeding, 
we have created mice with the same 
genotype for sweet taste receptors, 
but some are very avid consumers of 
sweeteners while others consume them 
in very modest amounts,” he says, and 
suggests that this behaviour arises from 
variations in more central neurological 
mechanisms related to taste response. This 
added complexity leaves a lot of room for cul- 
tural influences and environmental factors to 
shape how we assign reward value to a flavour 
and might in turn affect the contribution of 
more subtle genome-level factors. As such, 
inherited differences in taste receptor expres- 
sion or function alone are probably insuffi- 
cient to explain how many of us overcome our 
innate aversion to bitterness and sourness to 
thoroughly enjoy a steaming demitasse of 
espresso or a bracing gin and tonic. 
Nevertheless, there is evidence that genetic 
changes can modulate the response of this 
normally hard-wired sensory system. Zuker 
concludes that meaningful progress in untan- 
gling the neural processes behind food choice 
will require a solid understanding of what 
happens when meal meets mouth. “Before 
we can understand how the brain knows,’ he 
says, “we need to figure out how the tongue 
knows. = 


Michael Eisenstein is a journalist in 
Philadelphia. 
Children wait to be fed during the Dutch Hongerwinter of 1944-1945. 


Tales of adversity 


Genetic studies of people conceived during famine reveal 
that prenatal malnutrition lingers long after the event. 


BY FAROOQ AHMED 


It is well established that a pregnant woman’s habits affect the health of her unborn child, but the extent of the impact is less well known. Recent studies of tragic historical events, namely the Dutch Hongerwinter and the Great Chinese Famine, have begun to highlight the trans-generational relationship between food and genes.

The Hongerwinter (hunger winter) began late in 1944 towards the end of the Second World War. Food supplies in the northern and western regions of Nazi-occupied Holland became increasingly limited as the Germans halted overland transport of goods into Amsterdam and nearby cities.

Exacerbating this blockade, the harsh winter froze canals — cutting off a vital supply route. Rations in cities dropped to as few as 500 calories per day, less than a quarter of the recommended intake, until the country was liberated in May 1945, but not before 18,000 people starved to death.

Many children conceived during the Hongerwinter were small and underweight. What's more, certain health problems have persisted long into their adult lives. Compared to their siblings conceived before or after the famine, the Hongerwinter children are at increased risk for obesity, for example.


A propensity for obesity was also found in 
children of the 1968-1970 Biafra famine in a 
recent study in Nigeria. 

The Great Chinese Famine, from 1958 to 
1961, was caused by a combination of leader 
Mao Zedong’s agricultural policies during 
the Great Leap Forward, widespread misman- 
agement and severe weather. Tens of mil- 
lions of people died. Studies of Chinese born 
during this period link prenatal famine expo- 
sure to an increased risk of schizophrenia — a 
link also found in the Dutch Hongerwinter 
cohort. 

“These extreme events offer special oppor- 
tunities for research in humans that you might 
not otherwise have,” says Lambert Lumey, 
an epidemiologist at Columbia University, 
New York, who is studying the effects of the 
Hongerwinter. There are obvious ethical issues 
and long time spans involved that make recre- 
ating the circumstances of famine impossible. 
“These events are crucial to helping us develop 
and discover underlying disease mechanisms,” 
says Lumey. 


TELL-TALE DNA 

Scientists have discovered that certain genes of children conceived during a prolonged period of starvation receive special epigenetic ‘tags’ through a process called methylation — a gene modification that typically deactivates a gene, but does not alter the genetic code. Methylation is part of normal development, but patterns vary across individuals.

Nearly six decades after the famine, Lumey and colleagues isolated DNA from Hongerwinter individuals. They found a below-average methylation of the insulin-like growth factor II gene (IGF2), which codes for a growth hormone critical to gestation. Decreasing the methylation of IGF2 should increase the expression of the hormone. In contrast, later studies in this cohort found increased methylation of five other genes, among them genes associated with cholesterol transport and ageing, as well as the gene that produces IL-10, which has been linked with schizophrenia.

The mechanisms of these epigenetic changes and whether they have a bearing on disease remain unclear. “In humans, these are the $100,000 questions,” says epigeneticist Robert Waterland from Baylor College of Medicine in Texas.

Lumey hopes to study the children of the ‘tagged’ individuals to see if these changes persist into the next generation. Epigenetic information is almost fully reset in very early development, so the outcome, he says, is difficult to predict. “This is an important question regardless of what the data will later show.”

Nevertheless, studies on these extreme events “provide the first convincing evidence that early nutritional exposure causes a persistent change in epigenetic regulation in humans,” notes Waterland. “It’s a proof of principle.”

Lumey is now looking to high-throughput sequencing methods to measure genome-wide DNA methylation. “We expect that this will tell us whether there also are more epigenetic differences between prenatally exposed individuals and their unexposed siblings, than the ones we found studying candidate loci,” says epigeneticist Bastiaan Heijmans of Leiden University in the Netherlands, who works with Lumey. If these modifications are indeed widespread throughout the genome, the cumulative effect of famine-induced epigenetic alterations might play a substantial role in disease progression.

Other research has shown that less-extreme diets also affect methylation patterns and disease susceptibility. For example, folic acid is an important supplement for pregnant women to help prevent neural tube defects in developing embryos. It has been shown to increase the methylation of IGF2, hinting that it works through an epigenetic mechanism.

Nevertheless, studying such catastrophes provides researchers with valuable information that is not otherwise available, revealing that the aftermath of famine and prenatal malnutrition lasts long after help arrives with life-saving food.
A flavour of the future 

Health biomarkers, smart technology and social networks are hastening an era of nutrition 
tailored to your individual needs but relying on information generated by the crowd. 


A man steps out of a health clinic after 
a monthly nutritional profile. He 
slides a ring onto his finger and the 
injection-free technology transmits a read-out 
of his blood constituents to a central server. 
Skimming the data sent to his smart phone, 
he looks at the recommendation for his evening 
snack — something with a little more selenium: 
brazil nuts, perhaps. He considers his diet 
for the coming week — logged with his 
refrigerator — and confirms an updated home- 
delivery shopping list. Finally, he tots up his 
credits for sharing this personal health data with 
a population-wide genome study—redeemable 
against the cost of his health insurance and 
nutritional supplements. It’s a familiar sight to 
his girlfriend. “We're having dinner at my parents’
tomorrow. Don’t you dare let the FatNav 
tell you what to eat, or me what to drink.”
There are signs that this future is fast 
approaching. Domestic sleep and weight moni- 
tors can transmit results using WiFi; fridges are 
in development that log what you've eaten; and 
dinner parties are complicated by food intoler- 
ance and fad diets. Already, pin-prick blood 


test results for diabetes can be uploaded online. 
Websites such as patientslikeme.org offer tips 
on drug and nutritional supplement regimens. 
And at SNPedia.com and DIYGenomics.org, 
people can share their entire genomic data to 
pool resources and provide more personal 
guidance on health issues. 

Can all these platforms create genetics-based 
nutrition advice? Will this affect our definition 
of health, or the distinction between food and 
drugs? And how personalized will our diets 
become? 


NOT IN SICKNESS BUT IN HEALTH 
Many researchers think that personalized 
nutrition must begin with a new suite of 
biomarkers: ones that measure health rather 
than disease. But what does that mean? “Here 
we are in the twenty-first century and we don't 
have a definition of health other than ‘the 
absence of disease’,” says Sian Astley, a nutrition 
researcher at the Institute of Food Research, 
UK. “Health is about much more.” 

Astley says that to comprehend what bio- 
active food compounds are doing we first have 


to understand what's going on in the body 
before it becomes ill. “Our difficulty is that the 
only biomarkers we have are for when the dis- 
ease process has already started.” 

‘Omics’ sciences, such as transcriptomics, 
proteomics and metabolomics, study many 
thousands of putative biomarkers in a process
called ‘extensive phenotyping’. “We now 
have examples where the protein finger- 
print in tissues can indicate precancerous 
changes long before symptoms appear,” says 
Astley. “The protein fingerprint offers us early 
diagnosis as well as an insight into potential 
changes that might be elicited by feeding people
a different diet.”

Astley also works for the Nutrigenomics 
Organisation (NuGO), an EU-funded project 
involving 23 universities and research insti- 
tutes. NuGO researchers believe that to find 
these health biomarkers, testing conditions 
will need a rethink. For example, although we 
are all in a state of homeostatic equilibrium, 
the ‘normal’ levels of metabolites, including 
glucose, plasma proteins, cytokines and signalling
molecules, vary from person to 
person. Challenging that state with exercise 
or new foods, and then measuring changes in 
metabolites as the body recovers, reveals more 
about its reaction to bioactive compounds 
than simply measuring metabolites in a rest- 
ing state. 


THE DEVIL IN THE DETAILS 

Extensive phenotyping is a big job and costs 
big money. Resource-limited researchers have 
two options: measure many people in lesser 
detail, or a smaller number in greater detail. 
Large population studies have more statistical 
power, but as the ultimate goal is personalized 
nutrition, an investigation of the individual 
will provide more in-depth information. 

It’s a conundrum facing Mike Gibney, direc- 
tor of University College Dublin’s Institute 
of Food and Health. “Too many people in a 
study smooths out the data and is too expen- 
sive in an era when so many measurements 
are needed,” he says. Gibney contends we are 
in transition towards personalized nutrition 
and advocates temporarily abandoning the 
‘individual’ mantra. Instead, people should be 
grouped into broader categories based 
on biomarkers that indicate, for example, 
how efficiently different sugars or 
proteins are metabolized. “I'm taking my 
research in the direction of clusters,” he says. “I believe 
it’s a half-way house.” These wider groupings 
have the advantages of consisting of larger 
populations and can act as a proof of concept. 

Results are emerging that support the 
notion of these clusters. Kenneth Kornman 
is founder of InterLeukin Genetics (ILG), a 
Massachusetts-based company developing 
tests for genes that affect food metabolism 
based on single nucleotide polymorphisms 
(SNPs). Kornman recently reanalysed a 2007 
study by Christopher Gardner and colleagues 
at Stanford University. In Gardner’s study, 
311 women were randomized to four differ- 
ent diets, which varied in the content of car- 
bohydrates. After 12 months, women on the 
low-carb, high-protein Atkins diet had lost 
the most weight. 

Kornman’s reanalysis involved placing 101 
of the women (those available for the follow-up 
study) into one of three groups categorized by 
three SNPs related to the metabolism of dietary 
fats and carbohydrates. Women in the ‘fat- 
sensitive’ group shared a SNP that meant they 
gained more weight from a high-fat diet than 
did women in the ‘carbohydrate-sensitive’ 
group, and vice versa. The third group was 
sensitive to neither fat nor carbohydrate. “Our 
company screened the published evidence on 
more than 200 SNPs and determined that these 
three were the only ones that met our criteria,” 
says Kornman. The criteria were that each SNP 
should have at least three validating clinical 
studies, should be functional (directly linked 
to biological or clinical effects) and linked to 
body weight. 

Kornman found that women on a diet that 
matched their genotype lost two-to-three times 
more weight than those on an unmatched diet. 
The study, sponsored by ILG, was presented 
at the 2010 Joint Conference of the American 
Heart Association in San Francisco. “The sci- 
entists in the audience were shocked,” recalls 
Ben van Ommen, director at the Netherlands 
Organisation for Applied Scientific Research 
and NuGO, who had invited Kornman to 
speak. Kornman, he says, “has been scrutinized
by the audience and he’s survived. Finally 
we have the proof in the pudding — genetic 
variety in dietary advice is relevant”. 


FOOD TRIBES 

Moving towards the more personalized end 
of the nutrition spectrum will require mil- 
lions more data points from many diverse 
groups. One way to collect information from 
disparate populations is to use crowd-sourc- 
ing technologies. Many people who have 
discovered some or all of their genetic infor- 
mation are sharing or offering it for analysis 
using websites such as SNPedia, DIYGenomics
and Harvard Medical School's Personal 
Genome Project. As genome testing becomes 
cheaper, more data will become available to 
use in this way. 

Founders of personal genome informa- 
tion-sharing websites, such as DIYGenom- 
ics’ Melanie Swan, say they can facilitate this 
data-gathering process by offering a new way 
to conduct science that appeals to the subjects. 
“We aim to give individuals the opportunity to 
participate in citizen science research studies,” 
says Swan. “The whole point is to experiment 
and find out what works best for you.” 

A typical experiment might investigate 
vitamin supplements and their precursors. 
Participants would consent to taking regu- 
lar supplements, pay for their own genetic 
sequencing test, submit regular tests to an 
approved laboratory, and upload results to the 
website. Combining data from all participants 
paints a picture of the relationship between 
certain genes and the impact of a vitamin or 
vitamin precursor on health. DIYGenomics’
first study — submitted to a peer-reviewed
journal — is a proof of concept, extending 
existing research on gene mutations and vita- 
min B deficiency. Another study on ageing is 
designed and set to recruit participants. 

This new approach to research blurs the 
distinction between study organizer and par- 
ticipant. “We all design the study and we all 
participate. We have our own consenting 
process too,” says Swan, adding that she sees a 
‘citizen ethicist’ version of the Hippocratic oath 
evolving to accommodate new ways of conducting
research. 

Some people see personal genomics as a log- 
ical follow-on to social networking and a valu- 
able asset. “There is definitely potential in a 
citizen science approach,” says Marina Levina, 
a communication researcher at The University 
of Memphis. Levina, however, adds a few cave- 
ats. “Citizen science implies that conventional 
science has failed us in some ways, whereas I 
would argue that guidelines and restrictions 
that perhaps slow down conventional science 
are there because of valid ethical issues.” 

There are other potential pitfalls. Genetic 
testing companies that provide genome- 
sharing websites have been criticized 
for offering inconsistent results and flimsy 
diagnoses regarding genetic propensity to 
disease. There are signs that the US Food 
and Drug Administration is moving to clip 
their wings, perhaps by enforcing tougher 
regulation. This echoes ongoing changes to 
regulation of the nutritional supplements 
industry in the United States and Europe, 
which is to be treated more like the pharma- 
ceutical industry. 

Genes are not the only important consid- 
erations when developing tailored nutritional 
advice. The nascent science of epigenetics, 
which describes how and when genes are 
turned on and off in the body, promises to both 
complicate and frustrate the road to personal- 
ized nutrition. 

ILG’s Kornman says epigenetics is the ele- 
phant in the room when it comes to determin- 
ing optimal diet: “There is growing evidence 
that prenatal nutrition and environmental 
effects have a lifelong and maybe multi-generational
effect in terms of fetal development
and early childhood nutrition.” Even 
if we can decode the genetic recipe of the 
diet-health relationship, without a greater 
knowledge of the epigenetic modifications 
put in place early in life — or in a mother’s or 
perhaps grandmother’s life — this recipe still 
might not taste right. 

What's more, can we ever over-ride our love 
for sweet, fatty and salty food? “People are per- 
verse about dietary choice,” says Tom Sanders, 
head of nutrition and dietetics at King’s College 
London. “They tend to offset what they per- 
ceive as good food with bad food.” Put another 
way, we are bad at eating good food, and good 
at eating bad food. 

Nutrigenomics may well change our defini- 
tion of health and disease; blur the distinction 
between food and drugs; between experi- 
menter and experimentee; and demonstrate 
new models of the scientific method driven by 
food tribes, citizen scientists and online social 
networks. The paradox is that as our lifestyles 
become ever more individualized, it could 
be the crowd that delivers the best advice for 
healthy eating. 


Arran Frood is a freelance writer in the UK. 


